Advances in Artificial Intelligence
Volume 2009 (2009), Article ID 219743, 11 pages
Bayesian Unsupervised Learning of DNA Regulatory Binding Regions
¹Department of Mathematics, Åbo Akademi University, 20500 Turku, Finland
²Department of Mathematics, University of Linköping, 58183 Linköping, Sweden
³Department of Mathematics, The Royal Institute of Technology, 100 44 Stockholm, Sweden
Received 13 February 2009; Revised 6 June 2009; Accepted 2 July 2009
Academic Editor: Djamel Bouchaffra
Copyright © 2009 Jukka Corander et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.