Gene and translation initiation site prediction in metagenomic sequences

Doug Hyatt, Philip F. LoCascio, Loren J. Hauser, and Edward C. Uberbacher
2012 July 03
Bioinformatics (2012) 28 (17): 2223-2230. doi: 10.1093/bioinformatics/bts429

Abstract

Motivation: Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data.
Results: We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.
Availability: The Prodigal software is freely available under the General Public License from http://code.google.com/p/prodigal/.

Highlights

Fig. 1. Example of best worst distance and recognizer in cluster

Citation

Doug Hyatt, Philip F. LoCascio, Loren J. Hauser, and Edward C. Uberbacher. Gene and translation initiation site prediction in metagenomic sequences. Bioinformatics 2012 28: 2223-2230.

