Testing for significance with microbiome data on individual taxa using R (CC122)

  • Published on 2 Jul 2024
  • Testing for significance across microbial taxa is a critical tool for analyzing microbiome data. Pat will show how he uses the wilcox.test with tidy data to compare the relative abundance of bacterial taxa (e.g. genus, OTUs, ASVs). After correcting for multiple comparisons using p.adjust, he shows how he would visualize the relative abundance of significant taxa across individuals along with an indicator of the median and interquartile range.
    In this episode, Pat will use data handling functions from the tidyverse including #nest, unnest, map, and #tidy as well as #wilcox.test in RStudio. The accompanying blog post can be found at www.riffomonas.org/code_club/....
    If you're interested in taking an upcoming 3-day R workshop, email me at riffomonas@gmail.com!
    R: r-project.org
    RStudio: rstudio.com
    Raw data: github.com/riffomonas/raw_dat...
    Workshops: www.mothur.org/wiki/workshops
    You can also find complete tutorials for learning R with the tidyverse using...
    Microbial ecology data: www.riffomonas.org/minimalR/
    General data: www.riffomonas.org/generalR/
    0:00 Introduction
    5:19 Testing for significance by SRN status
    14:26 Visualizing differences in relative abundance
    23:33 Improving appearance of figure
    28:35 Recap
  • Science and Technology
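
The workflow outlined in the description (nest the data by taxon, map wilcox.test over each nested data frame, tidy and unnest the results, then correct with p.adjust) might look roughly like the sketch below. It runs on simulated data and is not the code from the episode: the `composite` data frame, its column names (`sample`, `taxon`, `srn`, `rel_abund`), and the choice of the Benjamini-Hochberg method are all assumptions made for illustration.

```r
library(tidyverse)
library(broom)

set.seed(1)  # simulated data, NOT the dataset used in the episode

# Hypothetical tidy data: one row per sample and taxon
composite <- expand_grid(sample = paste0("s", 1:20),
                         taxon = c("taxon_a", "taxon_b", "taxon_c")) %>%
  mutate(srn = sample %in% paste0("s", 1:10),  # first 10 samples form one group
         # taxon_a is shifted upward in the srn group; the others are pure noise
         rel_abund = runif(n()) + if_else(taxon == "taxon_a" & srn, 1, 0))

sig_tests <- composite %>%
  nest(data = -taxon) %>%                        # one nested data frame per taxon
  mutate(test = map(data,
                    ~ tidy(wilcox.test(rel_abund ~ srn, data = .x)))) %>%
  unnest(test) %>%                               # expose statistic, p.value, ...
  mutate(p_adj = p.adjust(p.value, method = "BH")) %>%  # assumed method
  select(taxon, p.value, p_adj)

sig_tests
```

Taxa with `p_adj` below the chosen threshold (0.05 is a common choice) would then be carried forward to the plotting step.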

Comments • 45

  • @Riffomonas  3 years ago +1

    Have you used the set of map functions in the past? What questions do you have about these functions?

  • @revmohamed  8 months ago

    Thanks a lot for all your videos - very helpful!!

  • @aleonflux1138  3 years ago +3

    Another great tutorial, Pat. Your comment about a reductionist approach being inapplicable (or at least less than ideal) for microbiome analysis gave me the inspiration I needed for an upcoming lab group presentation.

    • @Riffomonas  3 years ago

      Lol - awesome! I hope I’m not getting you into trouble 😂. Let us know how it goes

    • @aleonflux1138  3 years ago +1

      @Riffomonas My presentation went really well :) Describing the difficulty of trying to boil down the bacteria involved in diseases to a single (or a few) candidate(s) led to a great discussion on the value of multi-omics approaches for assessing how whole bacterial communities behave.

    • @Riffomonas  3 years ago

      @aleonflux1138 Awesome! Glad it went well

  • @nicola84palm  2 years ago +1

    I am a bioinformatician and lover of the tidyverse and your videos are excellent!!

    • @Riffomonas  2 years ago

      Thanks Nicola! I’m glad you found the channel 🤓

  • @liliabkar  1 year ago

    Your videos are amazing!!!!

  • @soyeonkim9355  1 year ago +1

    Best of the best tutorials. Thank you so much!!

    • @Riffomonas  1 year ago

      You're very welcome! Thanks for watching

  • @signomar  2 years ago +1

    You are a person from whom I can learn

    • @Riffomonas  2 years ago

      Hey Marco! Thanks - keep watching and let me know if there are any concepts you'd like me to cover ☺

  • @saracorreagarcia  2 years ago +1

    Great tutorial, thanks!

  • @atsrajib  3 years ago +1

    Thanks 🙏. Helping a lot

    • @Riffomonas  3 years ago

      Wonderful - thanks for tuning in!

  • @wmavila_14  11 months ago

    Hey Pat! Thanks for this awesome video! I'm trying to identify significant differences in genera among mice from four different groups. At 6:37, you mentioned another video on the Schubert dataset with three groups, which seems relevant to my analysis. Could you kindly share the link to that episode? Much appreciated!

  • @bellatsachidou3992  1 year ago

    Thank you so much! What is the equivalent for testing for significance of individual taxa in a phyloseq object? For some reason it seems like too much trouble shuffling around types of data. Also, do you have or plan on doing a SIMPER analysis tutorial? Thanks again!

  • @mikhaeldito  3 years ago +1

    Thanks! I LOVE your content. Would you be interested in sharing your top tips for QC and preprocessing of microbiome data? Honestly, I am still confused about whether I should rarefy my 16S data or not.

    • @Riffomonas  2 years ago

      Sure - be sure to check out the MiSeq SOP at mothur.org for how we do things. And yes, please rarefy :)

  • @rupalhatkar4695  3 years ago +1

    The videos are super useful!! Could you please also post videos of cancer genomics data analysis? For example, analyzing and making copy number plots, structural variants (i.e. circos plot), etc? Thank you!!!

    • @Riffomonas  3 years ago

      Unfortunately that’s pretty far outside my area of expertise. But hopefully the concepts I cover will be generalizable enough that they continue to be useful

  • @louisl7245  9 months ago

    thanks

  • @betzabeatencio5777  2 years ago +1

    Great video once again, thanks! Is there any video where you show how to make a bubble plot with the relative abundance of each OTU as the size of the bubbles, like in this graph, with the taxonomic level on the y axis? Thanks, I am learning a lot with your videos!

    • @Riffomonas  2 years ago +1

      I don’t, but you could always map abundance to size within the aes function. Not sure how good that's going to look though, given the wide variation in abundances within a community

  • @tiberiusjimbo9176  2 years ago +1

    Great content and tutorials. Enjoyed every part of this video. I wonder how I can visualize a number of different plant lifeforms from two forest types (Primary and Secondary) across different elevational gradients. What would be the appropriate test to validate these patterns of diversity within these lifeforms? Any feedback or examples would be much appreciated. Thanks again.

    • @Riffomonas  2 years ago

      Thanks, Tiberius! Unfortunately, I'm afraid your application is too far outside of my area of expertise to give you a good answer

  • @sanujaahammu6747  2 years ago +1

    Hi, thank you very much for the code. I am new to R. If I want to add transparent box plots with whiskers to the plot, which code do I need to add? Thanks

    • @Riffomonas  2 years ago

      Try the stat_summary function. I have a few episodes using it that were released around this one. Thanks for watching!🤓

    • @sanujaahammu6747  2 years ago

      @Riffomonas thanks...👍
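
As a rough illustration of the suggestion in this thread, the sketch below overlays a box plot with no fill and a stat_summary layer marking the median and interquartile range on top of jittered points. The data are simulated and the names (`df`, `group`, `rel_abund`) and styling choices are assumptions, not the code from the episode.

```r
library(ggplot2)

set.seed(1)  # simulated data for illustration only
df <- data.frame(
  group = rep(c("healthy", "srn"), each = 15),
  rel_abund = c(rlnorm(15, meanlog = -3), rlnorm(15, meanlog = -2))
)

p <- ggplot(df, aes(x = group, y = rel_abund)) +
  # fill = NA leaves the box interior empty so the points stay visible
  geom_boxplot(fill = NA, width = 0.5, outlier.shape = NA) +
  geom_jitter(width = 0.15, height = 0) +
  # point at the median, line from the 25th to the 75th percentile
  stat_summary(fun = median,
               fun.min = function(x) quantile(x, 0.25),
               fun.max = function(x) quantile(x, 0.75),
               geom = "pointrange", color = "red") +
  scale_y_log10()

p
```

Either the geom_boxplot layer or the stat_summary layer can be dropped, depending on whether whiskers or a simpler median/IQR marker is wanted.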

  • @bridget9926  2 years ago +1

    Hi Pat, I'm confused as to when I should adjust my p-values... Based on your video, I should adjust them when making multiple comparisons. Does this mean I should also adjust my p-values when looking at alpha diversity between two groups? Thanks.

    • @Riffomonas  2 years ago +1

      You correct for multiple comparisons when you are repeating a test for comparisons that are not independent. If you are comparing Shannon between two groups there’s no need for a correction

  • @brindangnanampownraj1301  2 years ago +1

    It's really good. Can you provide your input data so I can reproduce the analysis or use it for practice?

    • @Riffomonas  2 years ago

      Thanks for watching! In the video description you'll find a link to the blog post for the episode (riffomonas.org/code_club/2021-07-02-wilcox.test). At the bottom of the page you'll find instructions on getting set up. If you don't want to mess with using git, you can get the raw data by going to github.com/riffomonas/mikropml_demo

    • @brindangnanampownraj1301  2 years ago

      @Riffomonas Thank you🙂

  • @samadhigunathunga2597  2 years ago +1

    Hi Pat,
    I really enjoy your videos and I have learned a lot in the past few weeks. I have a question. Is it always necessary to do the correction for multiple comparisons? In my data (16S rRNA gene sequences from soil), I get genera that differ significantly between samples, but I get none after doing the correction for multiple comparisons. Could there be false negatives after correction? What do you suggest I should do? Thanks in advance.

    • @Riffomonas  2 years ago

      You do need to correct for multiple comparisons. You might start by subsampling your data to a common read depth, screening out those OTUs whose relative abundances are too low to be interesting, and running the test with the OTUs that remain.

    • @jtaown  2 years ago +1

      @Riffomonas and @samadhi gunathunga When screening out the OTUs with low relative abundance, how do you do this? For example, I tried using filter(rel_abund > 0.0001) when making the composite, but it does something funny to the data frame and the stat tests stop working (10:25 in this video), even though they worked before I removed the low rel_abund values. This might be a bit complicated for a comment message :|
      And thanks for these incredible R videos - good for more than mothur data

    • @Riffomonas  2 years ago

      @jtaown That's the general idea. You'd want to filter out the entire taxon from all of the samples, not just from those samples where it is rare.
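
The difference between filtering rows and filtering whole taxa can be sketched with dplyr as below. This is only an illustration on a toy data frame: the names (`composite`, `taxon`, `rel_abund`) follow the comments above, while the use of the maximum relative abundance as the screening statistic is an assumption (a mean or median threshold would work the same way).

```r
library(dplyr)

# Toy data: "common" is abundant in two samples, "rare" is low everywhere
composite <- tibble(
  sample = rep(c("s1", "s2", "s3"), times = 2),
  taxon = rep(c("common", "rare"), each = 3),
  rel_abund = c(0.6, 0.00005, 0.4, 0.00002, 0.00003, 0.00001)
)

# Row-wise filter: also drops the single low observation of "common",
# so that taxon no longer has a value for every sample
row_filtered <- composite %>%
  filter(rel_abund > 0.0001)

# Taxon-wise filter: a taxon keeps ALL of its rows if it passes the
# threshold in at least one sample, and loses all of them otherwise
taxon_filtered <- composite %>%
  group_by(taxon) %>%
  filter(max(rel_abund) > 0.0001) %>%
  ungroup()
```

Here `row_filtered` keeps only two of the three "common" rows, whereas `taxon_filtered` keeps all three and removes "rare" entirely, so every remaining taxon still has one value per sample for the downstream tests.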

  • @abdimicro  1 year ago

    I used my own data, and p.adjust returned the same adjusted value for every test. Is that normal? I reviewed the code several times and I don't see any mistake. Thank you!
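
Identical adjusted values are not necessarily a sign of a coding error. As a small base R sketch with made-up p-values (the correction method behind the question above is not stated, so both methods shown here are assumptions), the Benjamini-Hochberg and Bonferroni adjustments can each collapse different raw p-values onto a single value:

```r
# Made-up raw p-values for four tests
p_values <- c(0.01, 0.02, 0.03, 0.04)

# Benjamini-Hochberg scales each p-value by n / rank and then forces the
# results to be monotonic, so evenly spaced p-values end up tied at 0.04
p_bh <- p.adjust(p_values, method = "BH")

# Bonferroni multiplies by the number of tests and caps the result at 1,
# so moderately large p-values all become 1
p_bonf <- p.adjust(c(0.3, 0.4, 0.5, 0.6), method = "bonferroni")

print(p_bh)
print(p_bonf)
```

If every adjusted value comes out the same, comparing it against the raw p-values and the number of tests is a quick way to check whether this kind of tie explains it.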

  • @MinimalistCookOfficial  1 year ago

    Hi, I have a data set where one column is labeled "Species" and the other two columns are labeled with the sample names "S1" and "S2". We want to use a Wilcoxon test to find the species whose counts differ significantly between the two samples. Can you please help us with the code? We searched for the data you used in this episode but could not find it. Please help us.

    • @Riffomonas  1 year ago

      Thanks for watching - if you only have one replicate for each group then you can't use statistical analysis to compare the relative abundance of each species. You can find the data I used in this episode here: github.com/riffomonas/minimalR-raw_data

  • @belgarath73g  4 months ago

    Why not use log1p?

    • @Riffomonas  3 months ago

      To be honest, because I didn't know it existed! Thanks for sharing 🤓