RPKM, FPKM and TPM, Clearly Explained!!!

  • Published on 28 Nov 2024

Comments • 182

  • @statquest
    @statquest  2 years ago +2

    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

  • @asmitaJ
    @asmitaJ 1 month ago +2

    I believe this has become the standard video anyone recommends when you want to understand different types of count normalizations. I have been recommended this by both my supervisor and my professor on two separate occasions haha

    • @statquest
      @statquest  1 month ago +2

      That's awesome! DOUBLE BAM! :)

  • @Anonymous9683
    @Anonymous9683 3 years ago +10

    I love how this man knows his content is irreplaceable so he can mess around in the intro without being concerned about losing viewers

  • @ivantsers9445
    @ivantsers9445 5 years ago +38

    Thank you very much for the explanation! One thing I should point out: the ORDER of division (i.e. the order of the steps) doesn't matter. What matters is WHAT you divide by: in TPM it's not the library size (i.e. the raw total number of reads) but the sum of the read counts normalized by length (i.e. the total RPK across all genes). This is the root of the difference between RPKM and TPM.

    • @nicolaikarcher7186
      @nicolaikarcher7186 4 years ago

      This is correct!

    • @LiptonTiptonTea
      @LiptonTiptonTea 4 years ago +4

      Couldn't agree more. This video gives the impression that changing the order of division is what makes the difference, while it's all about total reads vs. total length-normalized counts.

    • @gabriele223
      @gabriele223 2 years ago

      All counts of reads that map onto something, I suppose.

    • @ejvik3238
      @ejvik3238 3 months ago

      but why do we normalize by the length even for TPM?
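
The point made in this thread can be written out as formulas. This is a sketch; the symbols are assumptions introduced here and are not notation from the video: $c_i$ is the number of reads mapped to gene $i$, $L_i$ is the gene's length in bases, and $N = \sum_j c_j$ is the total number of mapped reads in the sample.

```latex
\mathrm{RPKM}_i = \frac{c_i}{(N / 10^{6})\,(L_i / 10^{3})} = 10^{9}\,\frac{c_i}{N\,L_i}
\qquad
\mathrm{TPM}_i = 10^{6}\,\frac{c_i / L_i}{\sum_j c_j / L_j}
             = 10^{6}\,\frac{\mathrm{RPKM}_i}{\sum_j \mathrm{RPKM}_j}
```

Both quantities divide by gene length. Only the "per million" divisor differs: $N$ for RPKM, and the sum of length-normalized counts for TPM. Swapping the order of the two divisions in RPKM gives the same number, which is why the order alone cannot explain the difference.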

  • @solidsnake013579
    @solidsnake013579 6 years ago +7

    hands down the most perfect explanation on the internet

    • @statquest
      @statquest  6 years ago

      Thank you! :)

  • @Tiago211287
    @Tiago211287 9 years ago +8

    The clearest explanation of TPM/FPKM/RPKM I have ever heard. I don't know why so many PhDs were so confusing when trying to explain this to me before.

    • @maxfeng4532
      @maxfeng4532 8 years ago

      +Joshua Starm thank you so much, I feel like the dust that had piled up in my mind has been cleaned out, this is perfect!

  • @efthymiakokkinou1616
    @efthymiakokkinou1616 5 years ago +50

    this guy is awesome.

    • @statquest
      @statquest  5 years ago +2

      Thank you! :)

  • @marekglombik8887
    @marekglombik8887 7 years ago +1

    I've just started my PhD and I'm really glad I found this. Thanks!

  • @tuskofgothos2637
    @tuskofgothos2637 6 years ago +1

    Your channel is an absolute gem! Please do keep up the good work. We need you!!

  • @de_aquila
    @de_aquila 5 years ago +2

    Thank you very much for this video! It's really very helpful!
    For many biologists who have the thirst to understand the logic behind why certain metrics are the way they are with respect to statistics... this is certainly of immense help.

  • @TheBlackCarlo
    @TheBlackCarlo 7 years ago

    My initial work for my PhD just got soooooo much easier and more fun. Thanks!

  • @MariaSamaloisaMarsa-lw4fk
    @MariaSamaloisaMarsa-lw4fk 8 months ago +1

    Thank you, sir. I have watched this RPKM video on TH-cam and it has really blessed me 🙏🙏
    My name is Maria Samaloisa, 4th semester. Thank you, may the Lord Jesus bless us all 🙏🙏👍

    • @statquest
      @statquest  8 months ago

      bam! :)

  • @syednajeebashraf4101
    @syednajeebashraf4101 8 years ago +3

    I watched this presentation and now I can explain this to even seniors in my place as well !! :)

  • @fabioPatroni
    @fabioPatroni 7 years ago

    The best and clearest explanation I've ever seen! Tks

  • @torlarsen2212
    @torlarsen2212 2 years ago +1

    Yet another great explanation StatQuest!!! You keep educating til today!!

  • @louisebuijs3221
    @louisebuijs3221 4 years ago

    RPKM = Reads Per Kilobase Million -> normalizes for read depth (some replicates simply have more read depth, which is technical) and for gene length
    - SE RNA-seq = RPKM
    - PE RNA-seq = FPKM (rest is the same)
    1. divide the reads per gene by the total number of reads in the replicate (or sample, whatever you want to call it), in millions
    2. divide by gene length, in kilobases
    TPM = different order
    1. divide the reads per gene by gene length, in kilobases
    2. divide by the total of those length-normalized values in the replicate, in millions
    Because the second TPM step divides by a different total, the relative expression is more easily comparable: in TPM the pie charts are all the same size, and in RPKM the pies are different sizes
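
The two recipes can be sketched in a few lines of Python. This is a minimal illustration, not code from the video: RPKM divides counts by the total reads (in millions) and by gene length (in kilobases), while TPM divides by gene length first and then by the total of the length-normalized values (in millions). The function names, gene lengths and counts below are made up for the example.

```python
def rpkm(counts, lengths_bp):
    """RPKM for one sample: scale by total reads (in millions), then by gene length (in kb)."""
    per_million = sum(counts) / 1e6  # total reads in the sample, in millions
    return [c / per_million / (l / 1e3) for c, l in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """TPM for one sample: scale by gene length (in kb), then by the total of those values."""
    rpk = [c / (l / 1e3) for c, l in zip(counts, lengths_bp)]  # reads per kilobase
    per_million = sum(rpk) / 1e6  # total RPK in the sample, in millions
    return [r / per_million for r in rpk]


# Made-up data: three genes (2 kb, 4 kb, 1 kb) in two samples with different depth.
lengths = [2000, 4000, 1000]
sample_1 = [10, 20, 5]
sample_2 = [30, 60, 3]

for name, counts in [("sample_1", sample_1), ("sample_2", sample_2)]:
    print(name,
          "RPKM total:", round(sum(rpkm(counts, lengths))),
          "TPM total:", round(sum(tpm(counts, lengths))))
```

The TPM values in each sample always add up to one million (the equal-sized pies), whereas the RPKM totals depend on the sample.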

  • @Pongant
    @Pongant 4 years ago +1

    I love your low-key intros

  • @emilemagadur3216
    @emilemagadur3216 3 days ago +1

    Thank you very much for the explanation ! It's so much clearer now ^^

    • @statquest
      @statquest  2 days ago

      Glad it helped!

  • @dreamyagnes
    @dreamyagnes 2 years ago +1

    Hi Josh, thank you so much for your videos.

    • @statquest
      @statquest  2 years ago

      Glad you like them!

  • @SNAKE1375
    @SNAKE1375 8 months ago +1

    Hi Josh, thanks very much for this video, again clear and well explained. It seems that TPM would be the most appropriate way to measure gene expression between samples. However, internet searches show the contrary. Some say that TMM would be the best solution. What do you think of this?

    • @statquest
      @statquest  8 months ago +1

      Thank you!

    • @SNAKE1375
      @SNAKE1375 8 months ago

      Thanks Josh, so what do you think about TMM instead of TPM?@@statquest

    • @statquest
      @statquest  8 months ago +1

      @@SNAKE1375 Unfortunately I haven't been involved with high-throughput sequencing for a long time now, so I don't know the answer.

  • @Qaxoontii
    @Qaxoontii 6 years ago +2

    Thank you so much for this explanation, it is very useful for us biologists who have no background in bioinformatics.

    • @statquest
      @statquest  6 years ago

      You're welcome! I'm glad to know that the video is helpful. :)

  • @kanefoster8780
    @kanefoster8780 4 years ago +2

    this is fantastic. I'm all over this goddam

  • @prachinagpal3112
    @prachinagpal3112 7 years ago

    Concrete explanation .
    Concepts explained to the point.
    Add more !

  • @KeziKing
    @KeziKing 1 year ago +1

    This was great!!! You really explained it clearly! Thanks so much!

    • @statquest
      @statquest  1 year ago

      Glad it was helpful!

  • @victorcampos9064
    @victorcampos9064 4 years ago +1

    Thank you so much!! Could not be explained clearer. Keep up the good work!

    • @statquest
      @statquest  4 years ago +1

      Thank you! :)

  • @Jonix-redhat
    @Jonix-redhat 2 years ago +1

    Thx for a great and easy explanation!

  • @Rd-lx8tu
    @Rd-lx8tu 3 years ago +1

    This video is a life saver! Thanks a Million!

  • @lucyyu2251
    @lucyyu2251 9 years ago +1

    This is very very clear! I wish I had seen this video earlier! Keep it up!

  • @TheLegendOfNiko
    @TheLegendOfNiko 4 years ago +2

    Perfect explanation, however, one thing was left out - TMM. How does TMM fit into the mix?

    • @statquest
      @statquest  4 years ago +2

      TMM is similar to what they do in DESeq2. For more details, check out: th-cam.com/video/UFB993xufUU/w-d-xo.html

  • @asiyazhao3820
    @asiyazhao3820 3 years ago +1

    very clear explanation best ever

  • @mrcoolgs100
    @mrcoolgs100 1 year ago +1

    Excellent work!!

  • @guigaolin6825
    @guigaolin6825 3 years ago +1

    Thanks for the video!
    Btw, a paper titled 'Single-cell RNA sequencing technologies and bioinformatics pipelines' published in 2018 seems to borrow your idea as their Fig.3c and without any citation.
    What do you think of that figure?

    • @statquest
      @statquest  3 years ago

      You're totally right. Thanks for pointing that out to me.

  • @VenkatNagaraju
    @VenkatNagaraju 4 years ago +1

    Nice explanation

  • @sambhavmishra1873
    @sambhavmishra1873 5 years ago +1

    Thank you so much, Josh Starmer !! It was a very clear explanation. My doubts are totally cleared.

    • @statquest
      @statquest  5 years ago

      Awesome! Thank you. :)

  • @krzysztofkolmus6936
    @krzysztofkolmus6936 6 years ago +1

    Hi Josh,
    Just a quick question regarding TPM. What am I supposed to use as the length input for TPM? For a given transcript, is it the total transcript length (exons, introns and UTRs) or just the length of the exons? Many thanks for your help!

    • @statquest
      @statquest  6 years ago

      It depends on how the sequencing is done. That said, most of the time, introns are spliced out of the transcript and are not sequenced, so you can exclude those from the length of the sequence. One sure way to know you're doing it right is to look at the alignments using a genome browser - then you'll see where the reads are mapping to - if it's just exons or exons + UTRs.

  • @george543
    @george543 8 years ago

    Thank you for the clear explanation. You made it so straightforward and easy!

  • @blackV199
    @blackV199 2 years ago

    I have a question, shouldn't we use the effective length rather than transcript length? could you maybe make a video about that?

    • @statquest
      @statquest  2 years ago

      I'll keep that in mind.

    • @blackV199
      @blackV199 2 years ago

      @@statquest Apologies, effective lengths can only be calculated when raw data (fastq files) is available. Here you discuss processed data (count data). Regardless, it would be pretty awesome if you could discuss the data processing pipeline.

  • @priyankamaripuri8249
    @priyankamaripuri8249 6 years ago +1

    I find your videos extremely helpful! Thank you so much!!!! Can you share your presentations too?

  • @mrnotsoevil
    @mrnotsoevil 8 years ago

    Thank you! Finally a nice and easy-to-understand explanation!

  • @tejasgohil9387
    @tejasgohil9387 8 years ago

    Most, most useful. I had been beating my head trying to understand RPKM/FPKM for the last 3 days by reading and reading and reading!!! But this 10 min video did it without any confusion. Thank you very much.

  • @bodhisattwabanerjee8936
    @bodhisattwabanerjee8936 8 years ago +1

    Wonderful explanation.. So informative, yet explained so easily. Thank you very much. It was indeed a great help.

  • @glorybasumata7555
    @glorybasumata7555 6 years ago +1

    Awesome! Pretty well explained and coherent.

    • @statquest
      @statquest  6 years ago +1

      Thanks!!! :)

  • @rayz1408
    @rayz1408 3 years ago +1

    This is awesome!! Thank you!

    • @statquest
      @statquest  3 years ago

      Glad you like it!

  • @MrDeking10
    @MrDeking10 5 years ago

    What are some typical TPM values? I got a lot of zeros in my dataset. However, there are a lot of values between 1 and 2, and some as high as 13. Thanks

  • @rojinsafavi797
    @rojinsafavi797 6 years ago +2

    Would you please elaborate on what length one should use if they have gene count instead of transcript count?

    • @statquest
      @statquest  6 years ago

      Are you talking about the length of the RNA fragments that are sequenced? I don't think it really matters much either way, however, maybe longer fragments are better for transcript-level counting, since you want the fragments to span exons.

    • @rojinsafavi797
      @rojinsafavi797 6 years ago +1

      Thanks for your quick reply :-), and yes, for example, if a gene has multiple isoforms, I wonder which isoform length should be used for the normalization step. I guess, based on what you mentioned, the longest isoform length should be used.

    • @statquest
      @statquest  6 years ago

      If you are just counting reads per gene, I think most people use the longest isoform. However, if you are counting reads per transcript, then you just use that transcript’s length.
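
The "longest isoform" convention mentioned in this reply can be sketched as follows. This is only an illustration under stated assumptions: the isoform table, gene IDs, transcript IDs and lengths are hypothetical, and other conventions for gene length exist.

```python
# Hypothetical isoform table: (gene_id, transcript_id, transcript_length_bp)
isoforms = [
    ("geneX", "tx1", 1200),
    ("geneX", "tx2", 2500),
    ("geneY", "tx3", 800),
]


def longest_isoform_lengths(rows):
    """Return {gene_id: length of that gene's longest isoform}."""
    lengths = {}
    for gene_id, _transcript_id, length_bp in rows:
        # Keep the largest length seen so far for this gene.
        lengths[gene_id] = max(length_bp, lengths.get(gene_id, 0))
    return lengths


gene_lengths = longest_isoform_lengths(isoforms)
print(gene_lengths)
```

When counting reads per transcript instead of per gene, this step is not needed: each transcript's own length is used directly.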

  • @williammo4450
    @williammo4450 4 years ago +1

    This guy is amazing! So clear!

    • @statquest
      @statquest  4 years ago +1

      Thanks! :)

  • @satu272
    @satu272 7 years ago +1

    So good! Thank you, this really helps with my thesis.

  • @taraeicher4241
    @taraeicher4241 5 years ago +2

    Great explanation! Thank you!

  • @rodolfoaramayo7392
    @rodolfoaramayo7392 8 years ago

    Good Job!
    I am going to use this video to explain these concepts in Genomics, a graduate/undergraduate class I teach at Texas A&M University

  • @george543
    @george543 7 years ago +1

    Josh, could you help answer a question for me?
    When normalizing to the total read count (the second step of TPM, after normalizing to gene length), is the total read count the sum of normalized read counts that are mapped to genes only? What about the reads that are not annotated? Thanks for your help!

  • @easyasperl
    @easyasperl 8 years ago

    So is TPM more like FPKM in the sense that it keeps track of paired end reads?

  • @stemcell1167
    @stemcell1167 8 months ago

    Hello! I am supposed to do TPM normalisation of my count matrix. Can I use the steps explained here as they are? Or should I use a tool or package?

    • @statquest
      @statquest  8 months ago

      Usually a package will do this for you, but you can also follow these steps.

  • @jamshidkhorashad1998
    @jamshidkhorashad1998 4 years ago +1

    This was great, thanks

    • @statquest
      @statquest  4 years ago

      Glad you enjoyed it!

  • @sumitkumar-el3kc
    @sumitkumar-el3kc 4 years ago

    What does sequencing depth really signify? Does having more sequencing depth mean higher expression? Then why is normalization for depth required?

    • @statquest
      @statquest  4 years ago

      For details on what Sequencing Depth means and why we need to normalize, see: th-cam.com/video/tlf6wYJrwKY/w-d-xo.html

  • @nnzhou9493
    @nnzhou9493 4 years ago

    Hey Josh, I used DESeq2 and got a list of significantly differentially expressed genes. Then I checked the TPM of those genes. Some genes' TPMs are quite low (< 1), some are quite high (hundreds or thousands). Should I use a TPM cut-off value to filter out the low-expression genes? If I have to do this, which cut-off value do you prefer? Any suggestion is welcome. Thank you!

    • @statquest
      @statquest  4 years ago

      DESeq2 should do this filtering for you. For more details, see: th-cam.com/video/Gi0JdrxRq5s/w-d-xo.html

  • @Adelphos0101
    @Adelphos0101 4 years ago +1

    Excellent video!

  • @lilhedayat
    @lilhedayat 4 years ago

    why is it that longer genes will have more reads mapping to them? are longer genes more amplified or is it because the short fragment of reads can be mismapped?

    • @statquest
      @statquest  4 years ago +2

      Imagine I have mRNA transcripts for two different genes, Gene A and Gene B. The mRNA transcripts for Gene A are 300 bp long and the mRNA transcripts for Gene B are 900 bp long. Now, since the sequencer can only sequence 300 bp long fragments, I break all of the mRNA transcripts into pieces that are 300 bp long. That means for each mRNA transcript for Gene A, we get one 300 bp long fragment to sequence. For Gene B, we get 3 fragments to sequence. In other words, we will sequence 3 times as many fragments for every mRNA transcript from Gene B as from Gene A. Does that make sense?

    • @lilhedayat
      @lilhedayat 4 years ago +1

      @@statquest It absolutely does!!! Thank you so much for explaining, I completely missed that! I always assumed that you would correct for this. I was under the assumption that not the fragment, but the entire 900 bp, would count as 1 count by default.
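
The Gene A / Gene B example from this thread can be reproduced with a few lines of arithmetic. This is an idealized sketch: it assumes every transcript is cut into equal 300 bp pieces, that every piece is sequenced, and that both genes have the same (made-up) number of mRNA copies.

```python
FRAGMENT_BP = 300  # fragment size the sequencer can read, as in the example


def fragments_sequenced(transcript_bp, copies, fragment_bp=FRAGMENT_BP):
    """Fragments obtained when every copy of a transcript is cut into equal pieces."""
    return copies * (transcript_bp // fragment_bp)


copies = 10  # assumed: both genes have the same number of mRNA copies
genes = {"Gene A": 300, "Gene B": 900}  # transcript lengths in bp

raw = {name: fragments_sequenced(bp, copies) for name, bp in genes.items()}
per_kb = {name: raw[name] / (genes[name] / 1000) for name in genes}

print(raw)     # Gene B yields three times as many fragments as Gene A
print(per_kb)  # after dividing by length, the two genes match up to rounding
```

Both genes are expressed at the same level here, and only the length-normalized values reflect that.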

  • @王吉-q4k
    @王吉-q4k 4 years ago +1

    Thumb up every video

    • @statquest
      @statquest  4 years ago

      Thank you! :)

  • @Eduardrssl
    @Eduardrssl 4 years ago +1

    Very nice vid!! Thanks!

    • @statquest
      @statquest  4 years ago

      Thank you! :)

  • @yanggao8840
    @yanggao8840 5 years ago +1

    very helpful, thanks very much

  • @lloydy272
    @lloydy272 8 years ago

    Thanks for explaining this in a way I can understand. My only question, how do people manage with R/FPKM if it is so hard to compare between reps?

    • @maxfeng4532
      @maxfeng4532 7 years ago

      Hey Joshua, thank you for the great video. Could you please explain why normalized counts are not used for statistical tests? The absolute values are changed by normalization, but the ranks and the relative expression have not changed... Is it because of isoforms? Thank you!

  • @fmetaller
    @fmetaller 6 years ago +1

    First I want to thank you for this great explanation.
    There is a point I'm missing. All these normalization techniques assume that each type of cell analyzed produces the same amount of RNA and that all the differences we see are due to variability in sequencing depth. But is this true? Wouldn't it be a better idea to normalize the counts only to some housekeeping genes, like we do with qPCR?

    • @statquest
      @statquest  6 years ago +1

      This is a great question. The reality is that when you do statistics on RNA-seq data, the normalization methods often use housekeeping genes. I explain how these normalization methods work in these videos: th-cam.com/video/UFB993xufUU/w-d-xo.html and th-cam.com/video/Wdt6jdi-NQo/w-d-xo.html

    • @fmetaller
      @fmetaller 6 years ago +1

      Oh, thanks for the answer(s)

  • @johnswenson6699
    @johnswenson6699 7 years ago

    Hey Joshua,
    Thanks so much for this video. I have a follow-up question: suppose I want to compare relative expression levels of gene A between two samples, but the tissue samples vary in size ... do these normalization methods take into account the fact that some samples will have more genes present than others?
    As a hypothetical (but easy to visualize) example, suppose I cut off a hand, ground it up, and sequenced the RNA. This is sample 1. For sample 2, I cut off a different hand AND the attached arm, ground them all up, and sequenced the RNA. If I expected gene A expression only in the fingertips, would I be able to compare the two samples to uncover which sample had more expression of gene A, even though sample 2 had more (and more diverse) input tissue than sample 1?
    In short, is there a normalization method that accounts for the fact that there may simply be a greater variety of genes being expressed in one sample relative to another?
    Thanks again for this video. You explained these concepts better than any other source I've found!

    • @johnswenson6699
      @johnswenson6699 7 years ago

      Brilliant. I didn't realize those programs included that kind of normalization ... Thanks a lot, sir. I'm going to watch those videos pronto!

  • @rollieize
    @rollieize 9 years ago

    nicely explained!

  • @carlagibbs3223
    @carlagibbs3223 5 years ago +1

    Excellent

  • @arpitachoudhury9788
    @arpitachoudhury9788 4 years ago

    Can you please make a detailed video on how limma+voom works

    • @statquest
      @statquest  4 years ago +1

      I'll keep it in mind.

  • @TheBloodyBeat
    @TheBloodyBeat 6 years ago +1

    Thanks for the awesome video ! If I understood well, none of these metrics takes into account the amount of unmapped reads. So does comparing TPM across samples that aren't replicates (e.g. a few environmental metagenomes) make any sense ?

    • @statquest
      @statquest  6 years ago +1

      You make a very good point. To be honest, TPM, FPKM, RPKM etc. are all just for convenience: they make the data easy to look at and get a general feel for. However, they are not used for any sort of "real" comparisons among samples. For example, DESeq2 and edgeR (and pretty much any other software that looks for differences between sets of "seq" samples) use completely different normalization strategies. These methods take into account that different samples might express different sets of genes, and that some samples might not have many reads overall, etc. So my advice is to use edgeR or DESeq2 to normalize your data for you, rather than doing it by hand. I have videos that show how normalization works in edgeR: th-cam.com/video/Wdt6jdi-NQo/w-d-xo.html and DESeq2: th-cam.com/video/UFB993xufUU/w-d-xo.html if you would like more information.

    • @TheBloodyBeat
      @TheBloodyBeat 6 years ago +1

      @@statquest Hi Josh, thanks a lot for your very helpful answer. I just watched your DeSeq2 video and it looks indeed a lot closer to what I'm looking for than the TPM/RPKM/FPKM metrics. I'll dive into the details and try it on my data.

    • @statquest
      @statquest  6 years ago

      @@TheBloodyBeat Hooray! :)
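
To give a feel for how a between-sample normalization differs from TPM/RPKM, here is a rough sketch in the spirit of the median-of-ratios idea behind DESeq2's size factors. It is an assumption-laden illustration written for this thread, not DESeq2's or edgeR's actual code; the function name and the counts are made up, and real analyses should use the packages themselves.

```python
import math
from statistics import median


def size_factors(counts_by_sample):
    """One scaling factor per sample, from the median ratio to a per-gene reference.

    counts_by_sample: list of per-sample count lists, genes in the same order.
    """
    n_genes = len(counts_by_sample[0])
    # Per-gene reference: log of the geometric mean across samples.
    # Genes with a zero count in any sample are skipped (marked None).
    log_reference = []
    for g in range(n_genes):
        values = [sample[g] for sample in counts_by_sample]
        if all(v > 0 for v in values):
            log_reference.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_reference.append(None)
    factors = []
    for sample in counts_by_sample:
        log_ratios = [math.log(sample[g]) - log_reference[g]
                      for g in range(n_genes) if log_reference[g] is not None]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Made-up counts: sample 2 is sequenced twice as deeply, and gene 4 is strongly up.
sample_1 = [10, 20, 30, 40]
sample_2 = [20, 40, 60, 800]
factors = size_factors([sample_1, sample_2])
print(factors)
```

Because the median is used, the single strongly changed gene does not distort the estimated depth difference between the two samples, which is something a simple "divide by the total" approach cannot guarantee.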

  • @LGARCIA20504
    @LGARCIA20504 5 years ago

    Very good man!

  • @leixiao169
    @leixiao169 3 years ago

    Great lecture. Thanks StatQuest! I wonder if DESeq2 automatically normalizes counts based on FPKM or TPM?

    • @statquest
      @statquest  3 years ago +1

      For details on how DESeq2 normalizes reads, see: th-cam.com/video/UFB993xufUU/w-d-xo.html

    • @leixiao169
      @leixiao169 3 years ago +1

      @@statquest thanks!

  • @steffimatchado8442
    @steffimatchado8442 4 years ago

    Thanks for the very explanatory video. It is really helpful for students like me. Could you please post a video on N50 values and how they are used to evaluate an assembly?

  • @明坤宋
    @明坤宋 3 years ago

    Hi, your video is very helpful! But if I only have log2RPM data, how can I find the differentially expressed genes? Is there any way to convert the log2RPM data to count data?

    • @statquest
      @statquest  3 years ago

      Not that I know of.

  • @areeniiitd
    @areeniiitd 7 months ago +1

    great video ngl.

    • @statquest
      @statquest  7 months ago

      Thanks!

  • @biotechsampath
    @biotechsampath 7 years ago

    awesome explanation....thanks

  • @RonaldCutler
    @RonaldCutler 7 months ago

    Now you should make a video on why you can't use these to compare genes between samples, only to compare genes to each other within a sample. Since TPM is a proportion, if one gene goes up in a sample, then the rest of the genes will seem like they are going down, when in reality they might be at the same level!

    • @statquest
      @statquest  7 months ago

      I'll keep that in mind.
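
The compositional effect described in this comment is easy to reproduce. A minimal sketch with made-up numbers: three genes of equal length, where only gene C changes between the two conditions. The `tpm` helper is written here for illustration, and the example assumes, unrealistically, that raw counts directly reflect absolute abundance.

```python
def tpm(counts, lengths_bp):
    """TPM for one sample: reads per kilobase, rescaled so the values sum to one million."""
    rpk = [c / (l / 1e3) for c, l in zip(counts, lengths_bp)]
    total = sum(rpk)
    return [r / total * 1e6 for r in rpk]


lengths = [1000, 1000, 1000]  # genes A, B, C, all 1 kb (made up)
before = [100, 100, 100]      # raw counts in condition 1
after = [100, 100, 1000]      # only gene C goes up in condition 2

tpm_before = tpm(before, lengths)
tpm_after = tpm(after, lengths)

for gene, b, a in zip("ABC", tpm_before, tpm_after):
    print(gene, round(b), "->", round(a))
```

Genes A and B have identical raw counts in both conditions, yet their TPM values drop because gene C now takes a larger share of the fixed one-million total.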

  • @sanjaisrao484
    @sanjaisrao484 2 years ago +1

    Thanks

  • @elzedliew972
    @elzedliew972 3 years ago +1

    statquest is an encyclopedia of ...

  • @krzysztofkolmus6936
    @krzysztofkolmus6936 6 years ago

    Great video! Can anyone recommend an R package for TPM normalisation? Thanks a lot in advance!

  • @zekihi6994
    @zekihi6994 7 years ago

    so good! Thanks.

  • @reafdaw01
    @reafdaw01 7 years ago

    You are pretty awesome! Thanks.

  • @eldorado.t
    @eldorado.t 4 years ago +1

    Awesome 😍 thanks

  • @pythonsun996
    @pythonsun996 6 years ago +2

    very good!

    • @statquest
      @statquest  6 years ago

      Thank you! :)

  • @ejvik3238
    @ejvik3238 3 months ago

    For the TPM, why do we normalize by the gene length?

    • @statquest
      @statquest  3 months ago

      Because the number of reads per gene scales by the length of the gene.

    • @ejvik3238
      @ejvik3238 3 months ago

      @@statquest Even if I sequence the transcriptome of a sample and I'm interested in how much or how little (if at all) the genes are expressed?

    • @statquest
      @statquest  3 months ago

      @@ejvik3238 yep

    • @ejvik3238
      @ejvik3238 3 months ago

      @@statquest I just watched one of your videos called "StatQuest: A gentle introduction to RNA-seq" so if I understand that correctly we have to divide by the gene length because we create fragments from the RNA to 200 - 300 bp to be able to even start sequencing. If so my question would be why don't we divide by the number of fragments instead?

    • @statquest
      @statquest  3 months ago

      @@ejvik3238 The number of reads per gene is a function of the gene's length (because a 1kb long gene will create 5 200bp fragments and a 2kb gene will create 10) and its expression level. By dividing by the length, we can then determine expression level, which is what we are interested in.
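
The relationship described in this reply can be written as a rough formula. The symbols are assumptions introduced here: $n_g$ is the number of reads from gene $g$, $e_g$ is the number of transcript copies (the expression level), $L_g$ is the gene's length, and $f$ is the fragment length, with the idealization that every fragment is sequenced.

```latex
n_g \approx e_g \cdot \frac{L_g}{f}
\quad\Longrightarrow\quad
\frac{n_g}{L_g} \approx \frac{e_g}{f}
\qquad
\text{e.g. } f = 200\ \text{bp}:\quad \frac{1000}{200} = 5, \quad \frac{2000}{200} = 10
```

Since $L_g / f$ is the number of fragments per transcript, dividing by the number of fragments and dividing by the length differ only by the constant factor $f$, which is the same for every gene in the sample.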

  • @tinacole1450
    @tinacole1450 3 years ago +1

    love it...

    • @tinacole1450
      @tinacole1450 3 years ago +1

      even the corny songs.... because I know something good follows

    • @statquest
      @statquest  3 years ago

      Thank you very much! :)

  • @尼安德鲁-n6j
    @尼安德鲁-n6j 9 years ago

    Nice!

  • @shichengguo8064
    @shichengguo8064 4 years ago

    Well explained, but I don't agree that TPM is better than FPKM

  • @anjalipatni2580
    @anjalipatni2580 3 years ago

    Sir,
    My data do not have any replicates and it is a paired end data.

  • @km2052
    @km2052 4 years ago

    thx

  • @omarmohammadibrahim2197
    @omarmohammadibrahim2197 6 years ago

    the start felt like the ppap song :P
    but everything after that was awesome :D

  • @IsaacXinPei
    @IsaacXinPei 5 years ago

    Does the title have a typo? TPM => FPM?

    • @statquest
      @statquest  5 years ago

      I don't think there is a typo. The title is: "StatQuest: RPKM, FPKM and TPM". RPKM, FPKM and TPM are three (3) different ways to normalize high-throughput sequencing data.

    • @IsaacXinPei
      @IsaacXinPei 5 years ago +1

      @@statquest that's right, the first slide in the video says FPM, I think the slide has a typo

    • @statquest
      @statquest  5 years ago

      Ah! You are correct! That's amazing. This video has been online for 4 years and you are the first person to spot that.

    • @IsaacXinPei
      @IsaacXinPei 5 years ago +1

      @@statquest no problem at all, the videos are very useful, thank you for all the hard work!

  • @joshua20199
    @joshua20199 1 month ago

    Why isn't it TPKM? :/

    • @statquest
      @statquest  1 month ago

      No idea!

  • @MBCOUGER
    @MBCOUGER 8 years ago +1

    Thank you so much for this, I now no longer look like this when trying to explain this: imgur.com/gallery/iWKad22
