BioMed Research International
Volume 2015, Article ID 218068, 10 pages
http://dx.doi.org/10.1155/2015/218068
Research Article

A Fast Cluster Motif Finding Algorithm for ChIP-Seq Data Sets

Department of Automation, School of Electronics and Control Engineering, Chang’An University, Xi’an 710064, China

Received 8 April 2015; Accepted 4 June 2015

Academic Editor: Andre Van Wijnen

Copyright © 2015 Yipu Zhang and Ping Wang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. P. A. Pevzner and S.-H. Sze, “Combinatorial approaches to finding subtle signals in DNA sequences,” in Proceedings of the International Conference on Intelligent Systems for Molecular Biology (ISMB '00), vol. 8, pp. 269–278, 2000.
  2. F. Zambelli, G. Pesole, and G. Pavesi, “Motif discovery and transcription factor binding sites before and after the next-generation sequencing era,” Briefings in Bioinformatics, vol. 14, no. 2, pp. 225–237, 2013.
  3. E. R. Mardis, “ChIP-seq: welcome to the new frontier,” Nature Methods, vol. 4, no. 8, pp. 613–614, 2007.
  4. P. J. Park, “ChIP-seq: advantages and challenges of a maturing technology,” Nature Reviews Genetics, vol. 10, no. 10, pp. 669–680, 2009.
  5. P. Collas and J. A. Dahl, “Chop it, ChIP it, check it: the current status of chromatin immunoprecipitation,” Frontiers in Bioscience, vol. 13, no. 17, pp. 929–943, 2008.
  6. H. S. Rhee and B. F. Pugh, “Comprehensive genome-wide protein-DNA interactions detected at single-nucleotide resolution,” Cell, vol. 147, no. 6, pp. 1408–1419, 2011.
  7. C. Jia, M. B. Carson, Y. Wang, Y. Lin, and H. Lu, “A new exhaustive method and strategy for finding motifs in ChIP-enriched regions,” PLoS ONE, vol. 9, no. 1, Article ID e86044, 2014.
  8. X. Shirley Liu, D. L. Brutlag, and J. S. Liu, “An algorithm for finding protein-DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments,” Nature Biotechnology, vol. 20, no. 8, pp. 835–839, 2002.
  9. P. Machanick and T. L. Bailey, “MEME-ChIP: motif analysis of large DNA datasets,” Bioinformatics, vol. 27, no. 12, Article ID btr189, pp. 1696–1697, 2011.
  10. J. E. Reid and L. Wernisch, “STEME: efficient EM to find motifs in large data sets,” Nucleic Acids Research, vol. 39, no. 18, article e126, 2011.
  11. M. Hu, J. Yu, J. M. G. Taylor, A. M. Chinnaiyan, and Z. S. Qin, “On the detection and refinement of transcription factor binding sites using ChIP-Seq data,” Nucleic Acids Research, vol. 38, no. 7, pp. 2154–2167, 2010.
  12. I. V. Kulakovskiy, V. A. Boeva, A. V. Favorov, and V. J. Makeev, “Deep and wide digging for binding motifs in ChIP-Seq data,” Bioinformatics, vol. 26, no. 20, pp. 2622–2623, 2010.
  13. M. Thomas-Chollier, C. Herrmann, M. Defrance, O. Sand, D. Thieffry, and J. Van Helden, “RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets,” Nucleic Acids Research, vol. 40, no. 4, article e31, 2012.
  14. A. A. Sharov and M. S. H. Ko, “Exhaustive search for over-represented DNA sequence motifs with cisfinder,” DNA Research, vol. 16, no. 5, pp. 261–273, 2009.
  15. J. S. Liu, A. F. Neuwald, and C. E. Lawrence, “Bayesian models for multiple local sequence alignment and Gibbs sampling strategies,” Journal of the American Statistical Association, vol. 90, no. 432, pp. 1156–1170, 1995.
  16. R. Staden, “Methods to define and locate patterns of motifs in sequences,” Computer Applications in the Biosciences, vol. 4, no. 1, pp. 53–60, 1988.
  17. M. L. Bulyk, P. L. F. Johnson, and G. M. Church, “Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of transcription factors,” Nucleic Acids Research, vol. 30, no. 5, pp. 1255–1261, 2002.
  18. P. V. Benos, M. L. Bulyk, and G. D. Stormo, “Additivity in protein-DNA interactions: how good an approximation is it?” Nucleic Acids Research, vol. 30, no. 20, pp. 4442–4451, 2002.
  19. M. T. Lee, M. L. Bulyk, G. A. Whitmore, and G. M. Church, “A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays,” Biometrics, vol. 58, no. 4, pp. 981–988, 2002.
  20. J. Fischer, V. Heun, and S. Kramer, “Optimal string mining under frequency constraints,” in Knowledge Discovery in Databases: PKDD 2006, J. Fürnkranz, T. Scheffer, and M. Spiliopoulou, Eds., vol. 4213 of Lecture Notes in Computer Science, pp. 139–150, Springer, Berlin, Germany, 2006.
  21. Y. Zhang, H. Huo, and Q. Yu, “A heuristic cluster-based EM algorithm for the planted (l, d) problem,” Journal of Bioinformatics and Computational Biology, vol. 11, no. 4, Article ID 1350009, 19 pages, 2013.
  22. Q. Yu, H. Huo, X. Chen, H. Guo, J. S. Vitter, and J. Huan, “An efficient motif finding algorithm for large DNA data sets,” in Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM '14), pp. 397–402, IEEE, Belfast, UK, November 2014.
  23. M. J. Mason, K. Plath, and Q. Zhou, “Identification of context-dependent motifs by contrasting ChIP binding data,” Bioinformatics, vol. 26, no. 22, pp. 2826–2832, 2010.
  24. T. L. Bailey, M. Bodén, T. Whitington, and P. Machanick, “The value of position-specific priors in motif discovery using MEME,” BMC Bioinformatics, vol. 11, article 179, 2010.
  25. S. Georgiev, A. P. Boyle, K. Jayasurya, X. Ding, S. Mukherjee, and U. Ohler, “Evidence-ranked motif identification,” Genome Biology, vol. 11, no. 2, article R19, 2010.
  26. C. T. Harbison, D. B. Gordon, T. I. Lee et al., “Transcriptional regulatory code of a eukaryotic genome,” Nature, vol. 431, no. 7004, pp. 99–104, 2004.
  27. X. Chen, H. Xu, P. Yuan et al., “Integration of external signaling pathways with the core transcriptional network in embryonic stem cells,” Cell, vol. 133, no. 6, pp. 1106–1117, 2008.
  28. P. Cartwright, C. McLean, A. Sheppard, D. Rivett, K. Jones, and S. Dalton, “LIF/STAT3 controls ES cell self-renewal and pluripotency by a Myc-dependent mechanism,” Development, vol. 132, no. 5, pp. 885–896, 2005.
  29. J. Jiang, Y.-S. Chan, Y.-H. Loh et al., “A core Klf circuitry regulates self-renewal of embryonic stem cells,” Nature Cell Biology, vol. 10, no. 3, pp. 353–360, 2008.
  30. N. Ivanova, R. Dobrin, R. Lu et al., “Dissecting self-renewal in stem cells with RNA interference,” Nature, vol. 442, no. 7102, pp. 533–538, 2006.
  31. J. Kim, J. Chu, X. Shen, J. Wang, and S. H. Orkin, “An extended transcriptional network for pluripotency of embryonic stem cells,” Cell, vol. 132, no. 6, pp. 1049–1061, 2008.
  32. M. Thomas-Chollier, M. Defrance, A. Medina-Rivera et al., “RSAT 2011: regulatory sequence analysis tools,” Nucleic Acids Research, vol. 39, supplement 2, pp. W86–W91, 2011.
  33. T. Bailey and C. Elkan, “Fitting a mixture model by expectation maximization to discover motifs in biopolymers,” in Proceedings of the International Conference on Intelligent Systems for Molecular Biology (ISMB '94), August 1994.
  34. G. Pavesi, G. Mauri, and G. Pesole, “An algorithm for finding signals of unknown length in DNA sequences,” Bioinformatics, vol. 17, supplement 1, pp. S207–S214, 2001.
  35. T. L. Bailey, “DREME: motif discovery in transcription factor ChIP-seq data,” Bioinformatics, vol. 27, no. 12, Article ID btr261, pp. 1653–1659, 2011.
  36. G. Bourque, B. Leong, V. B. Vega et al., “Evolution of the mammalian transcription factor binding repertoire via transposable elements,” Genome Research, vol. 18, no. 11, pp. 1752–1762, 2008.