BioMed Research International
Volume 2016, Article ID 4986707, 10 pages
http://dx.doi.org/10.1155/2016/4986707
Research Article

PairMotifChIP: A Fast Algorithm for Discovery of Patterns Conserved in Large ChIP-seq Data Sets

¹School of Computer Science and Technology, Xidian University, Xi’an 710071, China
²School of Electronic Engineering, Xidian University, Xi’an 710071, China

Received 22 June 2016; Revised 4 September 2016; Accepted 27 September 2016

Academic Editor: Yudong Cai

Copyright © 2016 Qiang Yu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Linked References

  1. P. D'Haeseleer, “What are DNA sequence motifs?” Nature Biotechnology, vol. 24, no. 4, pp. 423–425, 2006.
  2. V. Matys, E. Fricke, R. Geffers et al., “TRANSFAC®: transcriptional regulation, from patterns to profiles,” Nucleic Acids Research, vol. 31, no. 1, pp. 374–378, 2003.
  3. D. GuhaThakurta, “Computational identification of transcriptional regulatory elements in DNA sequence,” Nucleic Acids Research, vol. 34, no. 12, pp. 3585–3598, 2006.
  4. A. Mathelier, W. Shi, and W. W. Wasserman, “Identification of altered cis-regulatory elements in human disease,” Trends in Genetics, vol. 31, no. 2, pp. 67–76, 2015.
  5. C.-W. Huang, W.-S. Lee, and S.-Y. Hsieh, “An improved heuristic algorithm for finding motif signals in DNA sequences,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 8, no. 4, pp. 959–975, 2011.
  6. P. A. Evans, A. D. Smith, and H. T. Wareham, “On the complexity of finding common approximate substrings,” Theoretical Computer Science, vol. 306, no. 1–3, pp. 407–430, 2003.
  7. F. Zambelli, G. Pesole, and G. Pavesi, “Motif discovery and transcription factor binding sites before and after the next-generation sequencing era,” Briefings in Bioinformatics, vol. 14, no. 2, Article ID bbs016, pp. 225–237, 2013.
  8. E. R. Mardis, “ChIP-seq: welcome to the new frontier,” Nature Methods, vol. 4, no. 8, pp. 613–614, 2007.
  9. H. Li and N. Homer, “A survey of sequence alignment algorithms for next-generation sequencing,” Briefings in Bioinformatics, vol. 11, no. 5, pp. 473–483, 2010.
  10. H. Kim, J. Kim, H. Selby et al., “A short survey of computational analysis methods in analysing ChIP-seq data,” Human Genomics, vol. 5, no. 2, pp. 117–123, 2011.
  11. G. Pavesi, P. Mereghetti, G. Mauri, and G. Pesole, “Weeder web: discovery of transcription factor binding sites in a set of sequences from co-regulated genes,” Nucleic Acids Research, vol. 32, pp. W199–W203, 2004.
  12. Q. Yu, H. Huo, Y. Zhang, and H. Guo, “PairMotif: a new pattern-driven algorithm for planted (l, d) DNA motif search,” PLoS ONE, vol. 7, no. 10, Article ID e48442, 2012.
  13. Q. Yu, H. Huo, Y. Zhang, H. Guo, and H. Guo, “PairMotif+: a fast and effective algorithm for de novo motif discovery in DNA sequences,” International Journal of Biological Sciences, vol. 9, no. 4, pp. 412–424, 2013.
  14. T. L. Bailey, N. Williams, C. Misleh, and W. W. Li, “MEME: discovering and analyzing DNA and protein sequence motifs,” Nucleic Acids Research, vol. 34, pp. W369–W373, 2006.
  15. M. Nicolae and S. Rajasekaran, “Efficient sequential and parallel algorithms for planted motif search,” BMC Bioinformatics, vol. 15, no. 1, article 34, 2014.
  16. M. Nicolae and S. Rajasekaran, “qPMS9: an efficient algorithm for quorum planted motif search,” Scientific Reports, vol. 5, article 7813, 2015.
  17. M. K. Das and H.-K. Dai, “A survey of DNA motif finding algorithms,” BMC Bioinformatics, vol. 8, supplement 7, article S21, 2007.
  18. C. Jia, M. B. Carson, Y. Wang, Y. Lin, and H. Lu, “A new exhaustive method and strategy for finding motifs in ChIP-enriched regions,” PLoS ONE, vol. 9, no. 1, Article ID e86044, 2014.
  19. F. Zambelli and G. Pavesi, “A faster algorithm for motif finding in sequences from ChIP-Seq data,” in Proceedings of the 8th International Meeting of Computational Intelligence Methods for Bioinformatics and Biostatistics, pp. 201–212, Gargnano del Garda, Italy, June 2011.
  20. M. Thomas-Chollier, C. Herrmann, M. Defrance, O. Sand, D. Thieffry, and J. van Helden, “RSAT peak-motifs: motif analysis in full-size ChIP-seq datasets,” Nucleic Acids Research, vol. 40, no. 4, article e31, 2012.
  21. A. A. Sharov and M. S. H. Ko, “Exhaustive search for over-represented DNA sequence motifs with CisFinder,” DNA Research, vol. 16, no. 5, pp. 261–273, 2009.
  22. Q. Yu, H. Huo, X. Chen, H. Guo, J. S. Vitter, and J. Huan, “An efficient algorithm for discovering motifs in large DNA data sets,” IEEE Transactions on Nanobioscience, vol. 14, no. 5, pp. 535–544, 2015.
  23. P. Machanick and T. L. Bailey, “MEME-ChIP: motif analysis of large DNA datasets,” Bioinformatics, vol. 27, no. 12, Article ID btr189, pp. 1696–1697, 2011.
  24. J. E. Reid and L. Wernisch, “STEME: efficient EM to find motifs in large data sets,” Nucleic Acids Research, vol. 39, no. 18, article e126, 2011.
  25. A. Lihu and Ş. Holban, “A review of ensemble methods for de novo motif discovery in ChIP-Seq data,” Briefings in Bioinformatics, vol. 16, no. 6, pp. 964–973, 2015.
  26. Q. Yu, H. Huo, R. Zhao, D. Feng, J. S. Vitter, and J. Huan, “Reference sequence selection for motif searches,” in Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM '15), pp. 569–574, IEEE, Washington, DC, USA, November 2015.
  27. J. Buhler and M. Tompa, “Finding motifs using random projections,” Journal of Computational Biology, vol. 9, no. 2, pp. 225–242, 2002.
  28. S. van Dongen, Graph clustering by flow simulation [Ph.D. thesis], Utrecht University, Utrecht, The Netherlands, 2000.
  29. P. A. Pevzner and S. H. Sze, “Combinatorial approaches to finding subtle signals in DNA sequences,” in Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, pp. 269–278, Menlo Park, Calif, USA, August 2000.
  30. X. Chen, H. Xu, P. Yuan et al., “Integration of external signaling pathways with the core transcriptional network in embryonic stem cells,” Cell, vol. 133, no. 6, pp. 1106–1117, 2008.
  31. M. Tompa, N. Li, T. L. Bailey et al., “Assessing computational tools for the discovery of transcription factor binding sites,” Nature Biotechnology, vol. 23, no. 1, pp. 137–144, 2005.
  32. G. E. Crooks, G. Hon, J.-M. Chandonia, and S. E. Brenner, “WebLogo: a sequence logo generator,” Genome Research, vol. 14, no. 6, pp. 1188–1190, 2004.