How to download sequencing data from SRA NCBI | Bioinformatics 101

  • Published on 7 Aug 2024
  • This is a basic hands-on tutorial to download sequencing data from SRA NCBI using SRA Toolkit.
    In this video, I have demonstrated how to download and configure SRA Toolkit and download sequencing data associated with GSE183947.
    Link to GSE183947:
    www.ncbi.nlm.nih.gov/geo/quer...
    Link to download SRA Toolkit:
    github.com/ncbi/sra-tools/wik...
    Link to install SRA Toolkit:
    github.com/ncbi/sra-tools/wik...
    Link to configure SRA Toolkit:
    github.com/ncbi/sra-tools/wik...
    Link to additional resources:
    1. www.ncbi.nlm.nih.gov/sra/docs...
    2. www.ncbi.nlm.nih.gov/sra/docs...
    Chapters:
    0:00 Intro
    2:19 Get SRR# ids
    5:34 Download SRA Toolkit
    7:50 Configure SRA Toolkit
    10:00 Download fastq files
    Show your support and encouragement by buying me a coffee:
    www.buymeacoffee.com/bioinfor...
    To get in touch:
    Website: bioinformagician.org/
    Github: github.com/kpatel427
    Email: khushbu_p@hotmail.com
    #bioinformagician #bioinformatics #sra #ncbi #genomics #beginners #hands-on #tutorial #howto #omics #research #biology #GEO #rnaseq #ngs

Comments • 74

  • @viniciussferreira
    @viniciussferreira 1 year ago +2

    Thank you so much for such a thorough video, well done! :)

  • @alinapadurari6001
    @alinapadurari6001 8 months ago +2

    Amazing channel! Thank you! You are helping me a lot with my dissertation 🎉

  • @user-re8jg8ep6n
    @user-re8jg8ep6n 9 months ago +2

    Thank you so much! This tutorial is very easy for me to understand, and very useful for beginners.

  • @sant0411
    @sant0411 5 months ago

    Your videos are saving my thesis ty so much!

  • @user-ec2sz8pu9v
    @user-ec2sz8pu9v 1 year ago +1

    Thank you so much for such a great explanation! 😁

  • @animatedbiologywitharpan
    @animatedbiologywitharpan 2 years ago +1

    really useful

  • @tushardhyani3931
    @tushardhyani3931 2 years ago

    Thank you for this video !!

  • @jagjotarora1369
    @jagjotarora1369 2 months ago

    really helpful video. Such useful and amazing content.

  • @zamanUSAlife
    @zamanUSAlife 2 years ago +2

    Hello, I am from South Korea, and I appreciate all of your lectures, which are quite helpful. Could you kindly make a short video soon for R beginners like me, covering how to install R and its packages, and how to resolve errors that appear during package installation? Thank you so much.

    • @Bioinformagician
      @Bioinformagician  2 years ago +3

      I will surely consider making a video covering basics of R and installing packages in R :)

  • @nimblent
    @nimblent 8 days ago

    Thank you so much for this tutorial!

  • @heatherpeng7437
    @heatherpeng7437 1 year ago +3

    Hello! Thank you for this useful material! Could I squeeze in a question: why did my Bash return "Segmentation Fault"? Thank you so much!

  • @dej09
    @dej09 1 year ago

    Great video, thank you! I hope you can make a tutorial on how to batch download sequences from SRA.

    • @Bioinformagician
      @Bioinformagician  1 year ago

      Sure, I will make a video on it. Thanks for the suggestion.

    • @AxomeV10
      @AxomeV10 1 year ago

      @Bioinformagician Hi, is there a video on how to batch download sequences from SRA? I don't see it on your channel.

  • @saadzaheer5773
    @saadzaheer5773 1 year ago +1

    Hi there. Thank you for the nice tutorial. Just wanted to ask:
    1. Is there a way to select a target folder where the fastq files will be downloaded directly? By default they are downloaded to the home directory.
    2. Can we download the files in compressed form (.gz format)?
    Thank you.
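
For the two questions above, a minimal sketch, assuming `fasterq-dump` and `gzip` are on your PATH. `fasterq-dump` accepts `-O` (`--outdir`) to choose the destination folder. As far as I know it has no compression flag of its own (the older `fastq-dump` has `--gzip`), so compression is a separate step here. The accession `SRR0000001`, the folder `./fastq`, and the helper names `run` and `fetch_to_dir` are placeholders for illustration. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: download FASTQ files into a chosen folder, then gzip them.
# SRR0000001 and ./fastq are placeholders, not values from the video.

# Print the command when DRY_RUN is 1 (the default); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

fetch_to_dir() {
  local acc="$1" outdir="$2"
  run mkdir -p "$outdir"
  # -O (--outdir) tells fasterq-dump where to write the FASTQ files.
  run fasterq-dump --split-files -O "$outdir" "$acc"
  # fasterq-dump writes uncompressed FASTQ, so compress afterwards.
  # For paired-end data the outputs are named like
  # SRR0000001_1.fastq and SRR0000001_2.fastq.
  run gzip "$outdir/$acc"_*.fastq
}

fetch_to_dir SRR0000001 ./fastq
```

Run it as `DRY_RUN=0 bash script.sh` once the printed commands look right for your setup.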

  • @tolga1292
    @tolga1292 1 year ago

    Thanks! But I am still puzzled about how many SRA files one should download. Let's say I want to benchmark tools on heart tissue from Tabula Muris. Do I have to download every .sra file, which is almost 2 terabytes? Is there no information about cell types in the organ/tissue?

  • @vahidgorganli8895
    @vahidgorganli8895 1 year ago

    thank you🙂

  • @PremanandAThambiAnnan
    @PremanandAThambiAnnan 2 years ago

    Thank you

  • @hk5safe887
    @hk5safe887 2 years ago

    👍 May I know how to check the download result if my broadband connection is not very stable? Thanks.

  • @felipenunezvillena2141
    @felipenunezvillena2141 1 year ago +1

    Hi. First of all, I would like to congratulate you, because all the content you are sharing is very helpful :).
    Regarding the video, I have a few questions.
    As you know, it is possible to find multiple runs (SRR IDs) for one experiment (SRX).
    I would like to ask what to do when this occurs. I have read that when multiple SRRs are found within an SRX entry, the runs should be concatenated to produce one SRR run per SRX entry. Is that correct?
    Best regards,
    Felipe

    • @Bioinformagician
      @Bioinformagician  1 year ago +1

      Each SRR is a separate sample/replicate. You should not merge SRR IDs into one. One experiment (SRX) can have multiple samples and/or replicates and hence multiple SRRs.

  • @stanyang4321
    @stanyang4321 1 year ago

    So, what is the next step with the data we downloaded? How do we merge these two fastq files for mapping to a reference?

  • @henryren2790
    @henryren2790 7 days ago

    Hi, great video. Quick question: can fastq data downloaded with the SRA Toolkit be used directly for STAR alignment? Will the file names and headers cause any errors? Thanks!

  • @swarupdas8403
    @swarupdas8403 2 years ago +5

    Didi, please upload more videos on RNA-seq data analysis.

  • @grsbiosciences
    @grsbiosciences 2 years ago +1

    Nice explanation, madam. How is this SRA data useful?

  • @desaishailesh3527
    @desaishailesh3527 1 year ago +2

    After doing everything as you said, when I run the final download step it shows "system could not find path".
    Please help me.

  • @mrinalsubash8358
    @mrinalsubash8358 1 year ago +3

    Hi! Loved the lecture! Very concise and informative in a short period of time. However, I have been facing one problem. I have not been able to run vdb-config with the command ./vdb-config -i because macOS says, "vdb-config.3.0.5" cannot be opened because the developer cannot be verified.
    How do I resolve this issue so that I can successfully run the SRA Toolkit on the SRR accession IDs?

    • @jaoverst
      @jaoverst 7 months ago +1

      This may be too late to help, but this is how I fixed the issue. First I changed the permissions on vdb-config.3.0.10 by running the following command in the terminal: chmod 755 vdb-config.3.0.10. If your Mac is still giving you a privacy error, go to System Settings > Privacy & Security and scroll all the way down to the Security section. You should see the name of the file there and can give it permission to run. It should run the next time. I hope this helps.

    • @kel19961
      @kel19961 3 months ago +1

      I was able to solve this issue in the following way:
      1. Follow along with the tutorial; you will run into the issue you are describing.
      2. Open Settings > Privacy & Security. Scroll down.
      3. Find the section under 'Security', where you will see an option for 'Open vdb-config.3.1.0 anyway'. It should then ask you for your password or fingerprint and start to work the next time you type the command in the terminal ;)
      I know this is mad late, but I hope it helps!
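
The permission fix from the replies above can be sketched as a small helper. The `chmod 755` step comes from the thread. The `xattr` step is an assumption on my part: macOS marks downloaded files with a `com.apple.quarantine` attribute, and removing it is a commonly used command-line alternative to approving the file in System Settings. `make_runnable` is a hypothetical helper name, and the file name in the usage line should match the version you actually downloaded.

```shell
#!/usr/bin/env bash
# Sketch: make a downloaded SRA Toolkit binary runnable.
# make_runnable is a hypothetical helper name.

make_runnable() {
  local file="$1"
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
  # Same permission change as in the reply above.
  chmod 755 "$file"
  # Assumption: on macOS, clear the quarantine attribute that triggers
  # the "developer cannot be verified" dialog. Skipped on other
  # systems; a missing attribute is not treated as an error.
  if [ "$(uname -s)" = "Darwin" ]; then
    xattr -d com.apple.quarantine "$file" 2>/dev/null || true
  fi
}

# Usage (adjust the file name to the version you downloaded):
# make_runnable ./vdb-config.3.0.10
```

Only clear the quarantine attribute for files you downloaded from a source you trust, such as the official SRA Toolkit release page.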

  • @tankkar9995
    @tankkar9995 2 years ago

    I am still not sure how to find the latest uploaded data on SRA….

  • @AyrodsGamgam
    @AyrodsGamgam 1 year ago +3

    Why have they made SRA download so complicated? It should be simple. Why all the hassle? Why does one have to go through the terminal?

  • @rishabhjaiswal9843
    @rishabhjaiswal9843 3 months ago

    How do we get a GSE ID, and can I download the SRA Toolkit on my phone?

  • @chickenkorma3163
    @chickenkorma3163 6 months ago

    Just a hint: it is recommended to use the --split-3 option instead of --split-files. It handles reads that do not have a mate and writes them to a third file.
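
A minimal sketch of that hint, assuming `fasterq-dump` is on your PATH: paired-end runs get `--split-3`, single-end runs get no split option. As far as I know, recent `fasterq-dump` versions already behave like `--split-3` by default, so the flag mainly makes the intent explicit. `dump_reads`, `run`, and the accession `SRR0000001` are placeholders, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: choose a split option from the library layout.
# dump_reads and the accession SRR0000001 are placeholders.

# Print the command when DRY_RUN is 1 (the default); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

dump_reads() {
  local acc="$1" layout="$2"
  case "$layout" in
    paired)
      # --split-3: mates go to _1/_2 files, unmated reads to a third file.
      run fasterq-dump --split-3 "$acc"
      ;;
    single)
      # Single-end runs need no split option.
      run fasterq-dump "$acc"
      ;;
    *)
      echo "layout must be 'paired' or 'single'" >&2
      return 1
      ;;
  esac
}

dump_reads SRR0000001 paired
```

The layout of a run (paired or single) is listed on its SRA record, so it can be looked up before choosing the call.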

  • @freezingtolerance7493
    @freezingtolerance7493 1 year ago

    Hello, thank you for providing this video. I just wonder if I need to set up a Linux environment to execute the SRA Toolkit.

    • @Bioinformagician
      @Bioinformagician  1 year ago

      Not necessarily, it can be used on other operating systems as well (github.com/ncbi/sra-tools/wiki/02.-Installing-SRA-Toolkit). However, Linux is preferred and is often hassle-free.

  • @rajathkumarp853
    @rajathkumarp853 1 year ago

    I need to split 20 files on Windows. Could you please help me with the commands?

  • @peluzaurioraje
    @peluzaurioraje 2 years ago +2

    Nice video. Can you show us how to do it with an accession list? Thanks.

    • @Bioinformagician
      @Bioinformagician  2 years ago

      You could loop through each line in your accession list file in bash. It shouldn't be difficult.

    • @damas1989
      @damas1989 1 year ago

      @Bioinformagician Is it possible to download all data in the accession list at the same time?
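
The bash loop suggested in the reply above could look roughly like this. It is a sketch under assumptions: the accession list is a plain text file with one SRR ID per line (the file name `SRR_Acc_List.txt` in the usage line is an assumed export name from the SRA Run Selector), `fasterq-dump` is on your PATH, and `download_list` and `run` are hypothetical helper names. Downloads run one after another, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: download every accession listed in a text file, one per line.
# download_list and SRR_Acc_List.txt are placeholder names.

# Print the command when DRY_RUN is 1 (the default); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

download_list() {
  local list="$1" acc cr
  cr="$(printf '\r')"
  # Read on file descriptor 3 so the download command keeps its own stdin.
  # The "|| [ -n ... ]" part handles a last line without a newline.
  while IFS= read -r acc <&3 || [ -n "$acc" ]; do
    acc="${acc%"$cr"}"              # tolerate Windows line endings
    [ -z "$acc" ] && continue       # skip blank lines
    run fasterq-dump --split-3 "$acc"
  done 3< "$list"
}

# Usage:
# download_list SRR_Acc_List.txt
```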

  • @SamipSapkota-zg8hy
    @SamipSapkota-zg8hy 1 month ago +1

    Yo sister, can you make a video on raw data processing after we download from NCBI?

  • @user-jf3th9gq1k
    @user-jf3th9gq1k 4 months ago

    Hello, please make a video on preparing a metadata file in R for gene expression analysis.

  • @healthnut4936
    @healthnut4936 1 year ago +1

    Does this work for single cell RNA/ATAC seq as well?

    • @Bioinformagician
      @Bioinformagician  1 year ago +1

      If the sequencing reads have been deposited in SRA, then I don't see why not.

  • @nicholeleach9888
    @nicholeleach9888 6 months ago +1

    Thank you! Everything worked for me, but I have almost 3000 SRR IDs that need to be downloaded. In this case, do you know what command I need to use to download the entire list instead of just one run at a time?

    • @julkajulka6751
      @julkajulka6751 1 month ago

      I'm also looking for a way to do this. Did you find a solution?

    • @kimayatekade5267
      @kimayatekade5267 1 month ago

      @julkajulka6751 Hey, could you figure this out? I am also looking for the same.
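
For a list of thousands of runs, one possible approach is to make the batch restartable and run a few downloads in parallel. The sketch below is not from the video. It assumes a text file with one SRR ID per line, an output folder, and the output naming `<ID>.fastq` or `<ID>_1.fastq` (optionally gzipped). `pending_accessions` is a hypothetical helper that prints only the IDs with no output file yet. A file left incomplete by an interrupted run would still count as done, so it is worth checking the most recent files after a crash.

```shell
#!/usr/bin/env bash
# Sketch: list accessions that still need downloading, so a long batch
# can be restarted. pending_accessions is a hypothetical helper.

pending_accessions() {
  local list="$1" outdir="$2" acc f found
  while IFS= read -r acc || [ -n "$acc" ]; do
    [ -z "$acc" ] && continue
    found=0
    # Treat an existing FASTQ (plain or gzipped) as "already done".
    for f in "$acc.fastq" "${acc}_1.fastq" "$acc.fastq.gz" "${acc}_1.fastq.gz"; do
      [ -e "$outdir/$f" ] && found=1
    done
    [ "$found" -eq 0 ] && echo "$acc"
  done < "$list"
  return 0
}

# Usage (4 downloads at a time; adjust -P to your bandwidth and disk):
# pending_accessions SRR_Acc_List.txt fastq \
#   | xargs -n 1 -P 4 fasterq-dump --split-3 -O fastq
```

`xargs -n 1 -P 4` starts one `fasterq-dump` per accession and keeps up to four running at once. Each run needs temporary disk space, so a lower `-P` value may be safer on small disks.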

  • @stemcell1167
    @stemcell1167 1 year ago

    Hi!
    I have a question.
    You explained the meaning of all the prefixes used in accession numbers. Following on from that, I would like to know the meaning of the prefix ERX...

    • @Bioinformagician
      @Bioinformagician  1 year ago

      Check this out - www.ncbi.nlm.nih.gov/sra/docs/srasearch/

  • @chrisdoan3210
    @chrisdoan3210 2 years ago +1

    Thank you for your video! When I run ./vdb-config -i, I get this error: "vdb-config.3.0.0" cannot be opened because the developer cannot be verified. Would you please tell me how to fix this?

    • @recepuyar6423
      @recepuyar6423 1 year ago

      Hi!
      I get the same error. Did you solve this?

    • @chrisdoan3210
      @chrisdoan3210 1 year ago +1

      @recepuyar6423 I remember going to Settings and allowing the system to run this software.

  • @recepuyar6423
    @recepuyar6423 1 year ago

    Thank you for your video! When I run ./vdb-config -i, I get this error: "vdb-config.3.0.0" cannot be opened because the developer cannot be verified. Would you please tell me how to fix this?

    • @viniciussferreira
      @viniciussferreira 1 year ago

      Go to System settings, Privacy and Security, there will be a pop-up asking for permission to open the file!

  • @sanjaisrao484
    @sanjaisrao484 2 years ago

    Is this --split-files command only for paired-end sequences?

    • @Bioinformagician
      @Bioinformagician  2 years ago +1

      Yes, for single-end reads, we don't use the --split-files option.

    • @sanjaisrao484
      @sanjaisrao484 2 years ago +1

      @Bioinformagician Thanks

  • @raushnichoudhary2382
    @raushnichoudhary2382 2 years ago +2

    Fasterq-dump --split-files command not found. What should I do?

    • @Bioinformagician
      @Bioinformagician  2 years ago +1

      Are you running this command within the bin/ folder or sra-toolkit?

    • @nayeemanushrat3174
      @nayeemanushrat3174 2 years ago +1

      @Bioinformagician The Fasterq-dump --split-files command is not working. Kindly help, please.

    • @Bioinformagician
      @Bioinformagician  2 years ago +1

      @nayeemanushrat3174 Are you running this command within the bin/ folder or sra-toolkit?

    • @nayeemanushrat3174
      @nayeemanushrat3174 2 years ago +1

      @Bioinformagician I ran this exactly as you showed in the video, and it is still not working 😮‍💨

    • @Bioinformagician
      @Bioinformagician  2 years ago +1

      @nayeemanushrat3174 The error is telling you that it cannot find the executable fasterq-dump.
      Can you email me a screenshot of the output of ls in the folder where you are trying to run this command?
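
A "command not found" error like the one in this thread can be narrowed down with a small check before running anything. The sketch below assumes only a POSIX-style shell. `locate_tool` is a hypothetical helper, and `$HOME/sratoolkit/bin` in the usage lines is a placeholder for wherever you unpacked the toolkit. One thing worth ruling out: command names are case-sensitive on Linux, so `Fasterq-dump` and `fasterq-dump` are different names.

```shell
#!/usr/bin/env bash
# Sketch: check where (or whether) the shell can find a tool.
# locate_tool and the sratoolkit path below are placeholders.

locate_tool() {
  local tool="$1" bindir="${2:-}"
  if command -v "$tool" >/dev/null 2>&1; then
    command -v "$tool"                  # found via PATH
  elif [ -n "$bindir" ] && [ -x "$bindir/$tool" ]; then
    echo "$bindir/$tool"                # found in the given bin/ folder
  else
    echo "$tool not found; names are case-sensitive on Linux" >&2
    return 1
  fi
}

# Usage:
# locate_tool fasterq-dump "$HOME/sratoolkit/bin"
# export PATH="$HOME/sratoolkit/bin:$PATH"   # then call it from anywhere
```

If the tool is only found in the bin/ folder, either call it with its path (for example `./fasterq-dump` from inside bin/) or add that folder to PATH as in the last usage line.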

  • @anguscampbell3020
    @anguscampbell3020 2 years ago

    This version of the SRA Toolkit does not contain the prefetch command. We are now on version 3.0.0 and this is version 2.1.1, so it is not useful anymore. The tutorial needs to be updated.

    • @Bioinformagician
      @Bioinformagician  2 years ago +3

      Thanks for pointing it out! There are two ways to download Runs - using prefetch or using fasterq-dump (www.ncbi.nlm.nih.gov/sra/docs/sradownload/). If you prefer to download Runs using the former method, you should use the newer version.
      Having said that, there will be more updates in the future for SRA toolkit (like every bioinformatics tool). The idea behind this tutorial is to demonstrate how one can use an existing package to download sequencing runs from NCBI. It is common practice among bioinformaticians to make sure they are using the updated/alternate functions that come along with these newer versions.
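
The prefetch route mentioned in the reply above can be sketched as three steps: fetch the run, check it, convert it. This assumes a recent SRA Toolkit with `prefetch`, `vdb-validate`, and `fasterq-dump` on your PATH. The `vdb-validate` step is my addition and is not mentioned in the thread. It may be useful on an unstable connection. `prefetch_then_dump`, `run`, and the accession `SRR0000001` are placeholders, and the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: download with prefetch first, then convert to FASTQ.
# prefetch_then_dump and SRR0000001 are placeholders.

# Print the command when DRY_RUN is 1 (the default); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

prefetch_then_dump() {
  local acc="$1"
  run prefetch "$acc"                 # download the run (.sra) locally
  run vdb-validate "$acc"             # check the downloaded file's integrity
  run fasterq-dump --split-3 "$acc"   # convert the local copy to FASTQ
}

prefetch_then_dump SRR0000001
```

Running `fasterq-dump` on the accession alone, as in the video, downloads and converts in one step. Splitting the work this way mainly helps when a download has to be retried or verified.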

  • @samiislampathan8367
    @samiislampathan8367 2 years ago

    It would be better if the presenter's voice were a little louder.

    • @Bioinformagician
      @Bioinformagician  2 years ago

      I shall take that into consideration next time. Thanks!

  • @andreasstuermer4946
    @andreasstuermer4946 1 year ago

    Why don't they just give you a "download as" window, where you click "download to folder Downloads"?

  • @rajathkumarp853
    @rajathkumarp853 1 year ago

    Could we please have a call? I am struggling with an NGS cancer profiling project.