BioMed Research International
Volume 2013 (2013), Article ID 617545, 7 pages
http://dx.doi.org/10.1155/2013/617545
Research Article

Exploring the Cooccurrence Patterns of Multiple Sets of Genomic Intervals

1Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, USA
2Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, GA, USA
3Center for Comprehensive Informatics, Emory University, Atlanta, GA, USA

Received 27 March 2013; Accepted 4 May 2013

Academic Editor: Zhongming Zhao

Copyright © 2013 Hao Wu and Zhaohui S. Qin. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Background. Exploring the spatial relationships among different genomic features has been of great interest since the early days of genomic research. These relationships can provide useful information for understanding certain biological processes. Recent advances in high-throughput technologies such as ChIP-seq produce large amounts of data in the form of genomic intervals. Most existing methods for assessing spatial relationships among intervals are designed for pairwise comparison and do not scale easily to many sets. Results. We present a statistical method and software tool to characterize the cooccurrence patterns of multiple sets of genomic intervals. The occurrences of genomic intervals are described by a simple finite mixture model, where each component represents a distinct cooccurrence pattern. The model parameters are estimated via an EM algorithm and can be viewed as sufficient statistics of the cooccurrence patterns. Simulation and real data results show that the model can accurately capture the patterns and provide biologically meaningful results. The method is implemented in a freely available R package, giClust. Conclusions. The method and the software provide a convenient way for biologists to explore the cooccurrence patterns among a relatively large number of sets of genomic intervals.
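
To make the idea of a finite mixture model fitted by EM concrete, the following is a minimal sketch and not the giClust implementation. It assumes that each genomic region is summarized as a binary vector recording which interval sets occur there, and that each mixture component is a product of independent Bernoulli distributions. Both assumptions, the function name `em_bernoulli_mixture`, and the toy data are hypothetical and are not taken from the abstract.

```python
import math
import random


def em_bernoulli_mixture(X, K, n_iter=200, seed=0):
    """Fit a K-component mixture of independent Bernoullis by EM.

    X is a list of binary rows: X[i][j] == 1 if interval set j occurs in
    region i (an assumed encoding, chosen for illustration only).
    Returns (pi, theta, resp): mixing weights, per-component occurrence
    probabilities, and per-region posterior component probabilities.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    eps = 1e-6  # keeps probabilities away from 0 and 1 so log() is defined
    pi = [1.0 / K] * K
    theta = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(K)]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each region,
        # computed in log space and normalized with the max-subtraction trick.
        resp = []
        for x in X:
            logp = []
            for k in range(K):
                lp = math.log(pi[k])
                for j in range(d):
                    p = theta[k][j]
                    lp += math.log(p) if x[j] else math.log(1.0 - p)
                logp.append(lp)
            m = max(logp)
            w = [math.exp(v - m) for v in logp]
            s = sum(w)
            resp.append([v / s for v in w])
        # M-step: responsibility-weighted frequencies.
        nk = [sum(r[k] for r in resp) for k in range(K)]
        pi = [max(v / n, eps) for v in nk]
        total = sum(pi)
        pi = [v / total for v in pi]
        for k in range(K):
            for j in range(d):
                num = sum(r[k] * x[j] for r, x in zip(resp, X))
                est = num / max(nk[k], eps)
                theta[k][j] = min(max(est, eps), 1.0 - eps)
    return pi, theta, resp


# Toy data (hypothetical): sets 0 and 1 tend to occur together,
# while set 2 tends to occur on its own.
toy = [[1, 1, 0]] * 30 + [[0, 0, 1]] * 30
weights, patterns, _ = em_bernoulli_mixture(toy, K=2)
for w, row in zip(weights, patterns):
    print(round(w, 3), [round(p, 3) for p in row])
```

In this sketch, each row of `patterns` plays the role of one cooccurrence pattern: entries near 1 mark interval sets that tend to be present together under that component, and `weights` gives the estimated share of regions following each pattern.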