Construct Database: Tips
- The larger the database, the longer the program will take to
complete. A well-designed database is therefore preferable to
one that includes 'everything'.
- When downloading sequence records from the RDP preview release (http://rdp.cme.msu.edu/html/analyses_preview.html),
use the default RDP setting of 'remove common gaps'. This way,
superfluous gap characters are removed while the RDP alignment is
retained, which speeds up the program (which must otherwise align the
sequences; see the Find Oligonucleotides step).
- Large Fasta-formatted files are analysed faster than equivalent
GenBank files, but...
- GenBank files supply more information to the program than Fasta
files.
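
To illustrate the 'remove common gaps' tip above: a common gap is an alignment column in which every sequence has a gap character. Dropping those columns shortens the sequences without changing how they line up against each other. The following is a minimal sketch of that idea only, not the RDP's or this program's own code; the function name remove_common_gaps, the example sequences, and the assumption that '-' and '.' are the gap characters are all hypothetical.

```python
def remove_common_gaps(aligned, gap_chars="-."):
    """Drop alignment columns in which every sequence has a gap character.

    `aligned` maps sequence names to aligned sequences of equal length.
    The gap characters '-' and '.' are an assumption for this sketch.
    """
    if not aligned:
        return {}

    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    width = lengths.pop()

    seqs = list(aligned.values())
    # Keep a column if at least one sequence has a non-gap character there.
    keep = [i for i in range(width)
            if any(seq[i] not in gap_chars for seq in seqs)]

    return {name: "".join(seq[i] for i in keep)
            for name, seq in aligned.items()}


# Made-up example alignment: only the fourth column is a gap in every sequence.
example = {
    "seq1": "AC--GT-A",
    "seq2": "A---GTCA",
    "seq3": "ACT-GT-A",
}
trimmed = remove_common_gaps(example)
for name, seq in trimmed.items():
    print(name, seq)
```

Columns where only some sequences have a gap are kept, because they carry alignment information; only the column that is a gap in all sequences is removed.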