Whole Genome Sequence Analysis | Bacterial Genome Analysis | Bioinformatics 101 for Beginners

  • Published on 5 Nov 2024

Comments • 96

  • @humphreyaddy7716
    @humphreyaddy7716 1 year ago

    I wish I had discovered this channel long ago. I have all the resources to become good in the area of bioinformatics.

  • @yusufomowumi4771
    @yusufomowumi4771 3 years ago +4

    This video was very helpful. Can you do a tutorial on how to detect contamination from reads? Thank you!

  • @SobinSGupta-vq3zn
    @SobinSGupta-vq3zn 2 years ago +2

    Your tutorial is amazing, but when I try to follow the same steps it automatically loads the sequence you used for the demonstration. How can I work with my own sequence following the same steps mentioned in the video? Please reply as early as possible. I will be really thankful.

  • @jesusgiovanimamani5671
    @jesusgiovanimamani5671 8 months ago +1

    Thank you so much. But I have some doubts: I'm using the macOS terminal, and I failed to install the environment.yaml. Is the problem the type of OS? Is this tutorial only for the Linux command line?

    • @bluefox_genshin
      @bluefox_genshin 6 months ago

      Hi, I'm also experiencing the same thing. :(

  • @bioinformaticscoach
    @bioinformaticscoach 2 years ago

    One-on-one coaching: clarity.fm/vincentappiah
    Reach out: bioinformaticscoach@gmail.com

  • @elizabethgyamfi1617
    @elizabethgyamfi1617 3 years ago +2

    Great work. Simplified presentation. Well done

  • @johirislam8174
    @johirislam8174 1 year ago

    Hello. Does this lecture cover WGS data analysis from start to finish in Linux? I mean from quality check to variant calling and variant annotation?

  • @alita2220
    @alita2220 1 year ago

    This is an amazing tutorial, thank you! Because the sequence data is short + long, I am swapping a few of the tools for PacBio HiFi data. It teaches me how to fish. It would be great if there were videos on calling variants in the future!

    • @bioinformaticscoach
      @bioinformaticscoach 1 year ago +1

      You can watch the tutorial on snippy, bcftools and freebayes.

  • @dr.maqsoodahmad8572
    @dr.maqsoodahmad8572 3 months ago

    Great work, need more videos

  • @naveedkhan-fi6ux
    @naveedkhan-fi6ux 2 years ago

    A great piece of work... awesome explanation, easy to follow... I wish you could upload a video on comparative genomics of fungi to identify effectors.

  • @kubrateksen8845
    @kubrateksen8845 2 years ago

    Amazing, we are waiting for more videos.

  • @manishvictor5293
    @manishvictor5293 2 years ago +1

    Dear Dr. Vappiah, very nice GitHub page and description of the same in the video.
    I am having a problem with ./polish.sh: the program runs fine, but at the end it returns
    cp: cannot stat 'pilon_stage1.fasta': No such file or directory
    cat: polishing_process/pilon_stage1.changes: No such file or directory
    Could you please help me sort out the error?

    • @bioinformaticscoach
      @bioinformaticscoach 2 years ago +1

      It's likely you missed a step. Try starting the analysis from the beginning.

  • @yushanlin2745
    @yushanlin2745 2 years ago

    Thank you for such an amazing video; it really helped me a lot with my research.
    I have a question: is reordering indispensable for bacterial assembly? Does skipping the reordering affect pangenome analysis? I have finished MLST and virulence gene detection, and it did not matter there.
    My data is Illumina NovaSeq paired-end, 2×150 bp. I read the RagTag paper and found that its data is long-read genome sequencing (average 15 kbp) from a plant.
    Looking forward to your reply. Thanks again.

    • @bioinformaticscoach
      @bioinformaticscoach 2 years ago

      Reordering is not really necessary for pangenome analysis. But I advise you to do it if you want to generate a draft sequence of your sample, because it maps your sequence to a reference genome and reorders the contigs using the reference genome as a template. So the sequence you get is better than the raw assembly contigs.

  • @muhammadshafiq3242
    @muhammadshafiq3242 3 years ago

    Hello, Sir, I have a problem with trimming. Could you kindly help me? When I write the skip it does not run for trimming.

  • @ldipotet
    @ldipotet 3 years ago

    that's amazing work you have done here !! congrats

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      Thanks. Expect more of such videos soon.

    • @ldipotet
      @ldipotet 3 years ago

      @bioinformaticscoach An interesting challenge would be to put all these commands in a CWL pipeline.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      @ldipotet Yes. That would be interesting. Maybe we can take it up in the future.

  • @sciforlife
    @sciforlife 3 years ago

    I have to work on a project this semester, and I want to do a bioinformatics study on microbial data. I just need a direction, like what studies I can do using bioinformatics techniques or machine learning?

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago +1

      That will be nice. First you need to read some papers to get to know what kinds of bioinformatics studies are done in the field. That is why I made this video on bacterial genome analysis. You can do a similar analysis with the pipeline I demonstrated and use the explanation I gave as a guide. For the machine learning, you need to first identify your area of interest and look at how machine learning is being applied in that area. There are lots of datasets available for you to use. Just identify your area of interest and you will be able to connect the dots. For example, if I were interested in cancer studies, I would look at how machine learning is used to predict cancer. Get to know what datasets are available and choose the one that works for you.

    • @sciforlife
      @sciforlife 3 years ago

      @bioinformaticscoach Thank you so much. May God bless you.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      @sciforlife You are welcome. You can also share it with those who need it.

  • @muhammadshafiq3242
    @muhammadshafiq3242 3 years ago

    Very nice tutorial. Can I use XFTP and XShell instead of Anaconda to do this kind of analysis? Thanks

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      Anaconda is used to install the tools, so it is important that you install it. But if you have a server that has all the tools installed, then you don't need to install it.
      XFTP and XShell are used to log in to SSH servers. You use them if you are accessing a remote server.

    • @muhammadshafiq7141
      @muhammadshafiq7141 3 years ago

      @bioinformaticscoach Yes, we have these servers provided by the school. Can I use the methods you used in this tutorial in XShell and XFTP? I just watched the video.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      @muhammadshafiq7141 You may want to ask your system admin about this. I personally use a Linux system, so I use my terminal to log in. I have also used MobaXterm on Windows.

  • @MrManikprabhu
    @MrManikprabhu 1 year ago

    Hi, is it possible to make a Venn diagram for five or six genomes?

  • @ehecatl3830
    @ehecatl3830 2 years ago

    Thanks Dr. You are very good!

  • @naveedkhan-fi6ux
    @naveedkhan-fi6ux 1 year ago

    Hi dear... I was following your guideline for BRIG, but I am not able to compare my genomes because it shows an error about the genome size being too big. My species' genome size is 41 Mb, so what other tool can I use for genome comparison?

    • @bioinformaticscoach
      @bioinformaticscoach 1 year ago

      You can use Circos. Alternatively, you can book a session with me and we can discuss further.

  • @ayoajayi280
    @ayoajayi280 3 years ago +1

    Hello. I love this presentation. I am a beginner. Can someone please quickly take me through the system requirements, how I can get or install Linux, and how I can install some of the tools for genome analysis on it? Thanks.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago +3

      First of all, there are different flavors of Linux (Ubuntu, CentOS, etc.). They are all free to download and install. You can install one in a virtual environment using the VirtualBox tool. Once you do that, you can send a notice and we pick it up from there.

    • @ayoajayi280
      @ayoajayi280 3 years ago

      @bioinformaticscoach Thanks. Please, what is the minimum system requirement that will be ideal for analysis of bacterial genomes and installation of those tools?

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago +3

      @ayoajayi280 I would recommend a Core i7 3.40 GHz, 16 GB RAM (32 GB or higher would be great) and 1 TB storage. I would also recommend installing Linux as the main operating system instead of the VirtualBox approach.

    • @nmg1909
      @nmg1909 3 years ago

      @bioinformaticscoach I love your presentation here. I have been searching for a bacteria population dataset for my research, "Biocorrosion detection in structures". I would appreciate it if you could point me to a link where I can get a microbial organism population dataset. Thanks.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago +1

      @nmg1909 What I do is search for papers on bacterial genome analysis. Usually they show the list of genomes used and you can download them.
      Here is an example of a dataset: cge.cbs.dtu.dk/services/evolution_data.php
      We can discuss this further on my Facebook page ( web.facebook.com/Bioinformatics-Coach-100614805459525 ) or Twitter ( @BioinfoCoach )

  • @biozarrice
    @biozarrice 3 years ago

    Good Morning. I think your bioinformatics tutorials are amazing. Could you do a tutorial on genome annotation of eukaryotic organisms?

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago +1

      Thanks Fernando for the suggestion. I will consider that.

  • @anindorahman2600
    @anindorahman2600 2 years ago +1

    Hello sir, I have a query: conda env create -f environment.yaml is not working. It has remained in the "Solving environment" state for 2-3 days and still doesn't work. I have tried everything, but it doesn't work. Can you please help with this matter?

    • @bioinformaticscoach
      @bioinformaticscoach 2 years ago +1

      Try updating your conda. If you still have issues, you can book a session with me and we can look at it.

    • @anindorahman2600
      @anindorahman2600 2 years ago

      When are you available, sir? I want to book a session... It still doesn't work.

    • @bioinformaticscoach
      @bioinformaticscoach 2 years ago

      @anindorahman2600 You can request a session here: clarity.fm/vincentappiah

  • @bioinformaticscoach
    @bioinformaticscoach 2 years ago

    One-on-one coaching: calendly.com/bioinformaticscoach

  • @josephwestley789
    @josephwestley789 3 years ago

    Hello, this is a great tutorial, thank you for putting it together!
    I am encountering an error when trying to run ./polish.sh. I am getting "Unable to access jarfile /bacterial-genomics-tutorial/apps/pilon.jar". Do you have any idea of what might be causing this error?
    Thanks in advance!
    EDIT: I am doing this in WSL1, by the way.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago +1

      This pipeline was designed to run directly in bash. If you are getting this error, then you have to modify the script and put the path of the pilon jar file in it. Or check to make sure the pilon jar file has been downloaded.

    • @josephwestley789
      @josephwestley789 3 years ago +1

      @bioinformaticscoach EDIT: Thank you for your reply, I had not extracted the jar folder contents. I have done so now, and it appears to be running!

  • @nickalbbar
    @nickalbbar 1 year ago +1

    First of all, thanks for this very helpful video. I was following the pipeline, but got stuck on an error in the step where you run the reorder_contigs script. It starts to run, but then I get the following message """
    Traceback (most recent call last):
    File "extract_reordered.py", line 13, in <module>
    reordered=[i for i in allseq if 'RagTag' in i.id and ID in i.id][0]
    IndexError: list index out of range """
    Then it doesn't generate the P7741.reordered.fasta.
    I've tried to repeat the process, but can't find a solution.
    What should I do?

    • @bioinformaticscoach
      @bioinformaticscoach 1 year ago

      Hi @nickalbbar. Are you running the pipeline on your own dataset or the data provided in the tutorial?

    • @nickalbbar
      @nickalbbar 1 year ago

      @bioinformaticscoach I'm using the dataset provided in the tutorial.

    • @bioinformaticscoach
      @bioinformaticscoach 1 year ago +1

      Hi @nickalbbar. I am investigating the issue. I will get back to you

    • @nickalbbar
      @nickalbbar 1 year ago +1

      @bioinformaticscoach OK!! Once again, thank you so much

    • @bioinformaticscoach
      @bioinformaticscoach 1 year ago +1

      @nickalbbar In the meantime you can watch this tutorial. I am sure it will be useful: th-cam.com/video/DKjGUwpCTDA/w-d-xo.html
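
The IndexError quoted in this thread comes from taking element [0] of a list that turned out to be empty: no record id contained both 'RagTag' and the expected ID. The thread does not settle why. As a minimal sketch for narrowing it down, the same selection can be written so that it reports which ids were actually present. The function name pick_reordered_id and the example ids are hypothetical and are not part of the tutorial's extract_reordered.py.

```python
def pick_reordered_id(record_ids, sample_id, tag="RagTag"):
    """Return the first id that contains both `tag` and `sample_id`.

    Raises ValueError listing the ids that were seen, instead of the bare
    IndexError produced by indexing an empty list with [0].
    """
    record_ids = list(record_ids)
    matches = [rid for rid in record_ids if tag in rid and sample_id in rid]
    if not matches:
        raise ValueError(
            f"no record id contains both {tag!r} and {sample_id!r}; "
            f"ids found: {record_ids}"
        )
    return matches[0]


if __name__ == "__main__":
    # Hypothetical ids, for illustration only.
    ids = ["refA_RagTag", "contig_12"]
    print(pick_reordered_id(ids, "refA"))
```

If the error message shows ids without the 'RagTag' tag, or with a different reference name than the script expects, that points at the scaffolding step or the ID passed to the script rather than at the extraction itself.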

  • @RuqaiyaTasneem-z5w
    @RuqaiyaTasneem-z5w 6 months ago

    Hello, I am having issues trying to create the env.yaml in conda even after updating conda...
    It says: warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY
    I am using WSL.

    • @RuqaiyaTasneem-z5w
      @RuqaiyaTasneem-z5w 6 months ago

      Could not solve for environment specs
      The following packages are incompatible...
      It's talking about bioperl.

    • @RuqaiyaTasneem-z5w
      @RuqaiyaTasneem-z5w 6 months ago

      I have installed Perl, but it's showing the same issue.

  • @leonmaric5055
    @leonmaric5055 2 years ago

    Very helpful! greetings

  • @purvagohil2240
    @purvagohil2240 3 years ago

    Is this possible for single-end reads from Ion Torrent?

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      Yes, the procedure can be applied. This paper may help: dl.acm.org/doi/10.1145/3093338.3093362

  • @rajneeshdadwal
    @rajneeshdadwal 1 year ago

    I wish to extract the draft genome from the RagTag output. How can I do that?

    • @bioinformaticscoach
      @bioinformaticscoach 1 year ago

      I do this using some Python code. You can modify the extract_reordered.py file and use it to extract the draft sequence. If you still have issues, then you can book a session with me.
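
For readers who want a starting point, here is a rough sketch of that kind of extraction. It is not the tutorial's extract_reordered.py. It assumes only what the traceback elsewhere in these comments shows, namely that the reordered records carry 'RagTag' in their id, and it uses a small hand-written FASTA parser so that it runs without Biopython. The function names read_fasta and extract_draft are made up for this example.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def extract_draft(text, tag="RagTag", width=60):
    """Return FASTA text with only the records whose header contains `tag`."""
    out = []
    for header, seq in read_fasta(text):
        if tag in header:
            out.append(">" + header)
            # Wrap the sequence at `width` characters per line.
            out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + ("\n" if out else "")
```

To use it on real output, read the scaffold FASTA into a string, pass it to extract_draft, and write the result to a new file. Records without the tag (contigs that were not placed against the reference) are left out, which may or may not be what a given analysis wants.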

  • @luisrendon5792
    @luisrendon5792 3 years ago

    Hello, I'm still having problems with the step reorder_contigs.sh... I've repeated the pipeline several times but I can't continue. How can I solve this? Thanks

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      What's the error message that is displayed?

    • @luisrendon5792
      @luisrendon5792 3 years ago

      @bioinformaticscoach When I execute ./reorder_contigs.sh I get no results. This is what the output told me: FileNotFoundError: [Errno 2] No such file or directory: 'P7741_reordered/ragtag.scaffolds.fasta'

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      @luisrendon5792 It's likely you missed one of the steps. I would like you to take your time and repeat them. Also, are you running the commands on Linux or macOS?

    • @graphiomics
      @graphiomics 2 years ago

      I also got an error in this process when I used my own sequences. "ragtag.scaffolds.fasta" is not found. It suggests something is wrong with the reference genome. I need your help urgently. Thanks a lot for this tutorial.

    • @sidratahir3645
      @sidratahir3645 2 years ago

      @bioinformaticscoach Traceback (most recent call last):
      File "/home/sar/bacterial-genomics-tutorial/extract_reordered.py", line 10, in <module>
      allseq=[i for i in SeqIO.parse(fastafile,'fasta')]
      File "/home/sar/miniconda3/envs/bacterial-genomics-tutorial/lib/python3.10/site-packages/Bio/SeqIO/__init__.py", line 605, in parse
      return iterator_generator(handle)
      File "/home/sar/miniconda3/envs/bacterial-genomics-tutorial/lib/python3.10/site-packages/Bio/SeqIO/FastaIO.py", line 183, in __init__
      super().__init__(source, mode="t", fmt="Fasta")
      File "/home/sar/miniconda3/envs/bacterial-genomics-tutorial/lib/python3.10/site-packages/Bio/SeqIO/Interfaces.py", line 48, in __init__
      self.stream = open(source, "r" + mode)
      FileNotFoundError: [Errno 2] No such file or directory: 'P7741_reordered/ragtag.scaffolds.fasta'
      This error occurred while running ./reorder_contigs.sh.
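
The FileNotFoundError in this thread says that 'P7741_reordered/ragtag.scaffolds.fasta' did not exist when the Python script tried to open it, so the problem lies before the extraction step. One possible cause, offered as an assumption to check and not as a confirmed diagnosis, is a naming difference between RagTag releases: newer releases appear to write ragtag.scaffold.fasta (singular) where the script expects ragtag.scaffolds.fasta. A small sketch that accepts either spelling and otherwise lists what the output directory contains; find_scaffold_fasta is a hypothetical helper, not part of the tutorial.

```python
from pathlib import Path


def find_scaffold_fasta(outdir):
    """Return the scaffold FASTA inside `outdir`, whichever spelling exists."""
    outdir = Path(outdir)
    if not outdir.is_dir():
        raise FileNotFoundError(
            f"{outdir} does not exist; did the scaffolding step run?"
        )
    # Assumption: the file name differs between RagTag releases.
    for name in ("ragtag.scaffolds.fasta", "ragtag.scaffold.fasta"):
        candidate = outdir / name
        if candidate.is_file():
            return candidate
    present = sorted(p.name for p in outdir.iterdir())
    raise FileNotFoundError(
        f"no scaffold FASTA in {outdir}; files present: {present}"
    )
```

If the directory is missing or empty, the scaffolding command itself failed, and its own log or error output is the place to look next.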

  • @raselbarua4578
    @raselbarua4578 3 years ago

    Good job

  • @billclintonaglomasa6543
    @billclintonaglomasa6543 3 years ago +1

    Great.

  • @ldipotet
    @ldipotet 3 years ago

    Hi Vincent, I was trying out your pipeline and found that in my scenario spades.py fails with the --careful option, so I had to run it with the --isolate option, and the result is the same as yours with the --careful option. I guess this is due to the spades.py software version or some other aspect of this environment, but in my scenario --careful triggers an execution in Standard mode and raises different exceptions related to some internal compression processes. I'm new to this kind of ecosystem, so what determines the version of the installed software? In your environment.yaml you never indicate any version. In my case I do it in my Dockerfile, which first installs my platform, and after that I tailor every specific thing that I need for every specific channel.
    Thanks in advance; a hint on this would be appreciated.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      Hi @ldg, thanks for the message. For Anaconda, if you don't specify a software version, it uses the most recent one. The --careful option works with SPAdes 3.14 and upwards. So if you got the error, then it's likely your SPAdes version is lower than 3.14. Thanks for the suggestion as well.

    • @ldipotet
      @ldipotet 3 years ago

      @bioinformaticscoach The version that I am running: SPAdes genome assembler v3.15.3.
      The manual says about --isolate: "This option is not compatible with --only-error-correction or --careful options." Thank you so much for your answer and for the clarification about version management in Anaconda.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      @ldipotet Yes, that's true about the compatibility. So you have to choose.
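
Since the manual text quoted in this thread says --isolate cannot be combined with --careful, a wrapper script can make that choice explicit. The sketch below only builds the argument list for spades.py and does not run anything. The helper spades_command and the file names in the usage example are hypothetical, and the tutorial's own scripts may call SPAdes differently.

```python
def spades_command(r1, r2, outdir, mode="careful"):
    """Build a spades.py argument list using exactly one of the two modes."""
    if mode not in ("careful", "isolate"):
        raise ValueError("mode must be either 'careful' or 'isolate'")
    return ["spades.py", "-1", r1, "-2", r2, "-o", outdir, f"--{mode}"]


if __name__ == "__main__":
    # Hypothetical paired-end read files and output directory.
    print(" ".join(spades_command("reads_1.fastq.gz", "reads_2.fastq.gz",
                                  "assembly", mode="isolate")))
```

The returned list can be handed to subprocess.run(..., check=True) once the printed command looks right. Running spades.py --version first shows which release the conda environment actually installed.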

  • @abdullahijama690
    @abdullahijama690 3 years ago

    Thanks for your tutorial; I have learnt a lot from it. I have a problem with the bacterial-genomics-tutorial: when I run conda env create --quiet -f environment.yaml, I get:
    Solving environment: ...working... failed
    ResolvePackageNotFound:
    - sratoolkit
    I am getting this message!

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      I have modified the yaml file. Please run the command again and let me know if it works.

    • @bioinformaticscoach
      @bioinformaticscoach 3 years ago

      Please download the updated yaml file, or manually edit the yaml file and remove the line with sratoolkit.

  • @sheynjila2457
    @sheynjila2457 1 year ago

    I am encountering the following problems when installing the python packages:
    bacterial-genomics-tutorial> conda env create --quiet -f environment.yaml
    Retrieving notices: ...working... done
    Collecting package metadata (repodata.json): ...working... done
    Solving environment: ...working... failed
    ResolvePackageNotFound:
    - porechop
    - mash
    - samtools
    - spades
    - perl-db-file
    - roary
    - sra-tools
    - perl-padwalker
    - sickle-trim
    - bwa
    - mafft
    - minimap2
    - mummer

    • @bioinformaticscoach
      @bioinformaticscoach 1 year ago

      Try updating your conda before installing the packages.

    • @sheynjila2457
      @sheynjila2457 1 year ago

      @bioinformaticscoach Thanks for the reply. I have updated conda but it has not changed.