BioMed Research International
Volume 2015 (2015), Article ID 853461, 10 pages
http://dx.doi.org/10.1155/2015/853461
Research Article

An Affinity Propagation-Based DNA Motif Discovery Algorithm

School of Computer Science and Technology, Xidian University, Xi’an 710071, China

Received 3 January 2015; Revised 10 June 2015; Accepted 11 June 2015

Academic Editor: Graziano Pesole

Copyright © 2015 Chunxiao Sun et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. G. D. Stormo, “DNA binding sites: representation and discovery,” Bioinformatics, vol. 16, no. 1, pp. 16–23, 2000.
  2. P. A. Pevzner and S.-H. Sze, “Combinatorial approaches to finding subtle signals in DNA sequences,” in Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, pp. 269–278, San Diego, Calif, USA, August 2000.
  3. T. D. Schneider, “Consensus sequence zen,” Applied Bioinformatics, vol. 1, no. 3, pp. 111–119, 2002.
  4. Q. Yu, H. Huo, Y. Zhang, and H. Guo, “PairMotif: a new pattern-driven algorithm for planted (l, d) DNA motif search,” PLoS ONE, vol. 7, no. 10, Article ID e48442, 2012.
  5. F. Y. L. Chin and H. C. M. Leung, “Voting algorithms for discovering long motifs,” in Proceedings of the 3rd Asia-Pacific Bioinformatics Conference (APBC '05), pp. 261–271, January 2005.
  6. J. Davila, S. Balla, and S. Rajasekaran, “Fast and practical algorithms for planted (l, d) motif search,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 4, no. 4, pp. 544–552, 2007.
  7. H. Dinh, S. Rajasekaran, and V. K. Kundeti, “PMS5: an efficient exact algorithm for the (l, d)-motif finding problem,” BMC Bioinformatics, vol. 12, article 410, 2011.
  8. E. S. Ho, C. D. Jakubowski, and S. I. Gunderson, “iTriplet, a rule-based nucleic acid sequence motif finder,” Algorithms for Molecular Biology, vol. 4, no. 1, article 14, 2009.
  9. H. Dinh, S. Rajasekaran, and J. Davila, “qPMS7: a fast algorithm for finding (l, d)-motifs in DNA and protein sequences,” PLoS ONE, vol. 7, no. 7, Article ID e41425, 2012.
  10. M. Nicolae and S. Rajasekaran, “Efficient sequential and parallel algorithms for planted motif search,” BMC Bioinformatics, vol. 15, no. 1, article 34, 2014.
  11. N. Pisanti, A. M. Carvalho, L. Marsan, and M.-F. Sagot, “RISOTTO: fast extraction of motifs with mismatches,” in LATIN 2006: Theoretical Informatics, vol. 7, pp. 757–768, Springer, 2006.
  12. A. Floratou, S. Tata, and J. M. Patel, “Efficient and accurate discovery of patterns in sequence data sets,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 8, pp. 1154–1168, 2011.
  13. M. F. Sagot, “Spelling approximate repeated or common motifs using a suffix tree,” in Proceedings of the 3rd Latin American Symposium on Theoretical Informatics (LATIN '98), pp. 374–390, 1998.
  14. G. Pavesi, G. Mauri, and G. Pesole, “An algorithm for finding signals of unknown length in DNA sequences,” Bioinformatics, vol. 17, supplement 1, pp. S207–S214, 2001, Proceedings of the 9th International Conference on Intelligent Systems for Molecular Biology (ISMB '01).
  15. J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, no. 22, pp. 4673–4680, 1994.
  16. T. L. Bailey and C. Elkan, “Fitting a mixture model by expectation maximization to discover motifs in biopolymers,” in Proceedings of the 2nd International Conference on Intelligent Systems for Molecular Biology, pp. 28–36, 1994.
  17. D. Quang and X. Xie, “EXTREME: an online EM algorithm for motif discovery,” Bioinformatics, vol. 30, no. 12, pp. 1667–1673, 2014.
  18. T. L. Bailey, N. Williams, C. Misleh, and W. W. Li, “MEME: discovering and analyzing DNA and protein sequence motifs,” Nucleic Acids Research, vol. 34, supplement 2, pp. W369–W373, 2006.
  19. G. Z. Hertz and G. D. Stormo, “Identifying DNA and protein patterns with statistically significant alignments of multiple sequences,” Bioinformatics, vol. 15, no. 7-8, pp. 563–577, 1999.
  20. C. E. Lawrence, S. F. Altschul, M. S. Boguski, J. S. Liu, A. F. Neuwald, and J. C. Wootton, “Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment,” Science, vol. 262, no. 5131, pp. 208–214, 1993.
  21. J. Buhler and M. Tompa, “Finding motifs using random projections,” Journal of Computational Biology, vol. 9, no. 2, pp. 225–242, 2002.
  22. C. Bi, “A Monte Carlo EM algorithm for de novo motif discovery in biomolecular sequences,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 6, no. 3, pp. 370–386, 2009.
  23. Q. Yu, H. Huo, Y. Zhang, H. Guo, and H. Guo, “PairMotif+: a fast and effective algorithm for De Novo motif discovery in DNA sequences,” International Journal of Biological Sciences, vol. 9, no. 4, pp. 412–424, 2013.
  24. W. Thompson, E. C. Rouchka, and C. E. Lawrence, “Gibbs recursive sampler: finding transcription factor binding sites,” Nucleic Acids Research, vol. 31, no. 13, pp. 3580–3585, 2003.
  25. F. P. Roth, J. D. Hughes, P. W. Estep, and G. M. Church, “Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation,” Nature Biotechnology, vol. 16, no. 10, pp. 939–945, 1998.
  26. G. Li, T.-M. Chan, K.-S. Leung, and K.-H. Lee, “A cluster refinement algorithm for motif discovery,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 7, no. 4, pp. 654–668, 2010.
  27. S. van Dongen, Graph clustering by flow simulation [Ph.D. thesis], University of Utrecht, 2000.
  28. C.-W. Huang, W.-S. Lee, and S.-Y. Hsieh, “An improved heuristic algorithm for finding motif signals in DNA sequences,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 4, pp. 959–975, 2011.
  29. B. J. Frey and D. Dueck, “Clustering by passing messages between data points,” Science, vol. 315, no. 5814, pp. 972–976, 2007.
  30. M. Leone, Sumedha, and M. Weigt, “Clustering by soft-constraint affinity propagation: applications to gene-expression data,” Bioinformatics, vol. 23, no. 20, pp. 2708–2715, 2007.
  31. D. Wang and N. K. Lee, “Computational discovery of motifs using hierarchical clustering techniques,” in Proceedings of the 8th IEEE International Conference on Data Mining (ICDM '08), pp. 1073–1078, December 2008.
  32. G. E. Crooks, G. Hon, J.-M. Chandonia, and S. E. Brenner, “WebLogo: a sequence logo generator,” Genome Research, vol. 14, no. 6, pp. 1188–1190, 2004.
  33. M. Tompa, N. Li, T. L. Bailey et al., “Assessing computational tools for the discovery of transcription factor binding sites,” Nature Biotechnology, vol. 23, no. 1, pp. 137–144, 2005.