A Re-Annotation of the Saccharomyces cerevisiae Genome
Discrepancies in the gene and orphan numbers reported by previous analyses suggest that S. cerevisiae would benefit from a consistent re-annotation. In this analysis, three new genes are identified and 46 alterations to gene coordinates are described. A total of 370 ORFs are classified as totally spurious and should be disregarded. At least a further 193 genes could be described as very hypothetical on the basis of a number of criteria. Disparate genes with sequence overlaps of more than ten amino acids (especially at the N-terminus) were found to be rare in both S. cerevisiae and Sz. pombe. A new S. cerevisiae gene number estimate with an upper limit of 5804 is proposed; after the removal of very hypothetical genes and pseudogenes, this is reduced to 5570. Although the lower figure is likely to be closer to the true upper limit, it is still predicted to be an overestimate of gene number. A complete list of revised gene coordinates is available from the Sanger Centre (S. cerevisiae reannotation: ftp://ftp/pub/yeast/SCreannotation).