Protein Coding Sequence Identification by Simultaneously Characterizing the Periodic and Random Features of DNA Sequences

Gao, Jianbo; Qi, Yan; Cao, Yinhe; Tung, Wen-wen

doi:https://doi.org/10.1155/JBB.2005.139

BioMed Research International

Special Issue

Data Mining in Genomics and Proteomics

Research article | Open Access

Volume 2005 | Article ID 371096 | https://doi.org/10.1155/JBB.2005.139

Jianbo Gao,¹ Yan Qi,² Yinhe Cao,³ and Wen-wen Tung⁴

Received 24 May 2004

Revised 30 Aug 2004

Accepted 03 Sept 2004

Abstract

Most codon indices used today are based on highly biased nonrandom usage of codons in coding regions. The background of a coding or noncoding DNA sequence, however, is fairly random, and can be characterized as a random fractal. When a gene-finding algorithm incorporates multiple sources of information about coding regions, it becomes more successful. It is thus highly desirable to develop new and efficient codon indices by simultaneously characterizing the fractal and periodic features of a DNA sequence. In this paper, we describe a novel way of achieving this goal. The efficiency of the new codon index is evaluated by studying all of the 16 yeast chromosomes. In particular, we show that the method automatically and correctly identifies which of the three reading frames is the one that contains a gene.

Copyright

Copyright © 2005 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
