Sir, I find your videos very helpful. Please upload more like this.
Thank you so much Sir
Thanks a lot ☺️
Hello
Thanks for this video 🤗
I want to know if it’s possible to import information for 1000 sequences from a file containing the different accession numbers of the sequences.
Thanks for your help
Yes, it is not difficult. It is, however, somewhat slow because of the speed at which the records are pulled from NCBI, but not nearly as slow as downloading the sequences manually. The code to do this is as follows:
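```{r}
library(traits)
# First read each line from the text file and put it into a df cell
# ("accessions.txt" is a placeholder file name, one accession number per line;
# the column name acc_no is likewise only illustrative)
accessionsdf <- data.frame(acc_no = readLines("accessions.txt"), stringsAsFactors = FALSE)
```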
You can also download any of the other values for each accession number. The available values, listed at www.rdocumentation.org/packages/traits/versions/0.5.0/topics/ncbi_byid, are:
taxon - taxonomic name (may include some junk, but hard to parse off)
taxonomy - organism lineage
gene_desc - gene description
organelle - if mitochondrial or chloroplast
gi_no - GI number
acc_no - accession number
keyword - if official DNA barcode
specimen_voucher - museum/lab accession number of vouchered material
lat_lon - longitude/latitude of specimen collection event
country - country/location of specimen collection event
paper_title - title of study
journal - journal study published in (if published)
first_author - first author of study
uploaded_date - date sequence was uploaded to GenBank
length - sequence length
sequence - sequence character string
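As a rough sketch of how only some of the values listed above might be pulled, assuming `ncbi_byid()` from the traits package accepts a character vector of accession numbers and returns a data frame with those columns (as its documentation describes). The helper name `fetch_fields`, its `fields` and `fetch` arguments, and the file name in the usage comment are made up for illustration and are not part of traits:

```r
# Hypothetical helper (not part of traits): fetch records for a vector of
# accession numbers and keep only the requested columns.
# `fetch` defaults to traits::ncbi_byid, so the traits package must be
# installed for real lookups; any function with the same shape also works.
fetch_fields <- function(ids,
                         fields = c("acc_no", "taxon", "length", "sequence"),
                         fetch = traits::ncbi_byid) {
  records <- fetch(ids)
  # Keep the requested columns that are actually present, in the order asked for.
  keep <- intersect(fields, names(records))
  records[, keep, drop = FALSE]
}

# Usage (needs an internet connection; "accessions.txt" is a placeholder):
# ids <- readLines("accessions.txt")
# info <- fetch_fields(ids, fields = c("acc_no", "country", "sequence"))
```

Passing the lookup function as an argument is only a convenience here: it lets the column selection be tried out on a small made-up data frame without contacting NCBI.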
Thanks a lot for your help 🥳
I did this successfully without any major problem.
One small problem was the number of sequences I could put through the loop at once, but in the end I managed to download the sequences and their information in steps of 200, and afterwards I used “rbind” to merge all the information into one data frame.
Thanks again 🙏🏼🙏🏼
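The batching described in this comment could look roughly like the sketch below. It assumes `ncbi_byid()` from traits takes a character vector of accession numbers and returns one data frame per call; the helper name `fetch_in_batches` and its arguments are hypothetical, and 200 is simply the step size reported above, not a documented limit:

```r
# Hypothetical helper (not part of traits): download records in batches
# and stack the per-batch data frames into one.
# `fetch` defaults to traits::ncbi_byid; any function that takes a character
# vector of ids and returns a data frame can be substituted.
fetch_in_batches <- function(ids, batch_size = 200, fetch = traits::ncbi_byid) {
  # Assign each id to a consecutive group of at most `batch_size` ids.
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  # Fetch each batch separately.
  results <- lapply(batches, fetch)
  # Bind all batches row-wise; unname() keeps the row names as plain 1..n.
  do.call(rbind, unname(results))
}

# Usage (needs an internet connection; "accessions.txt" is a placeholder):
# ids <- readLines("accessions.txt")
# all_records <- fetch_in_batches(ids, batch_size = 200)
```

Smaller batches also mean that a failed request only costs one batch rather than the whole download.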
@bioinformaticswithease2904 Thank you very much for your help, Sir. I want to ask how to import downloaded sequences, or my lab's own sequence data set, into R?
How do we do this the other way? I’m trying to give R a DNA sequence to run a BLAST search with, and then print the results of that search.