Depending on how you want to use your package, you may decide to build your database only when it is updated, or once per day, and save the db object to an RData or RDS file. The actual classification script would then load the object from file instead of regenerating the database every time.
Yeah, I agree - I'll probably end up storing it in a data package for a specific kmer size that people could install from GitHub. Some of the tangents I go on here are pedagogical and not always critical (like shaving 3 seconds :) )
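For anyone who wants to try the caching idea from this exchange, here is a minimal sketch in base R. It assumes a hypothetical `build_kmer_db()` function standing in for the real (slow) database build, plus a made-up file name and a 24-hour freshness limit. None of these names come from the video, so adapt them to your own code.

```r
# Hypothetical stand-in for the expensive database build step.
build_kmer_db <- function(kmer_size = 8) {
  list(kmer_size = kmer_size, built_at = Sys.time())
}

# Return the cached database if the file exists and is recent enough;
# otherwise rebuild it, save it with saveRDS(), and return the new object.
load_or_build_db <- function(path, kmer_size = 8, max_age_hours = 24) {
  if (file.exists(path)) {
    age_hours <- as.numeric(
      difftime(Sys.time(), file.mtime(path), units = "hours")
    )
    if (age_hours < max_age_hours) {
      return(readRDS(path))
    }
  }
  db <- build_kmer_db(kmer_size)
  saveRDS(db, path)
  db
}

# Assumed cache location, used here only for illustration.
db_path <- file.path(tempdir(), "kmer_db.rds")
db <- load_or_build_db(db_path)        # builds and saves if no fresh cache exists
db_again <- load_or_build_db(db_path)  # reads the saved object from file
```

The classification script would only ever call `load_or_build_db()`, so the build cost is paid at most once per day. A data package, as mentioned in the reply, would move this step to install time instead.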
awesome video, as usual!
Thank you so much for this content, Pat! I always learn something new!
wonderful - thanks for watching!
Not sure if this (general) question makes sense, but I wonder how many candidate genera are close to the maximum probability?
It will vary by family. That's an interesting question though!
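One rough way to explore this for a single sequence is to count the genera whose score falls within some tolerance of the best one. The sketch below uses made-up log-probabilities, placeholder genus names, and an arbitrary tolerance of 1 log unit. All of these are assumptions for illustration, not values from the video.

```r
# Hypothetical log-probabilities for one query sequence across candidate genera.
log_probs <- c(
  GenusA = -120.4,
  GenusB = -120.9,
  GenusC = -135.2,
  GenusD = -121.3,
  GenusE = -160.0
)

# Assumed cutoff: how far below the maximum still counts as "close".
tolerance <- 1

# Keep the genera whose log-probability is within `tolerance` of the maximum.
near_max <- log_probs[log_probs >= max(log_probs) - tolerance]
n_near_max <- length(near_max)
```

Running this over many sequences and grouping the counts by family would show how much the answer varies.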