BioMed Research International

Research Article

A Fast Cluster Motif Finding Algorithm for ChIP-Seq Data Sets

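Input: a test set and a control set.
Output: the set of motifs.

Initialize:
    the set of motifs
    the set of emerging substrings
    the set of qualified neighborhood instances
    the set of PWMs
    the set of intra-motif distributions

For each substring length in the search range do
    For each substring of that length do
        if the substring meets both emerging-substring thresholds then
            add the substring to the set of emerging substrings
For each substring in the set of emerging substrings do
    For each neighborhood instance of the substring do
        calculate the z-score of the neighborhood instance
        if the z-score > 1.643 then
            add the instance to the set of qualified neighborhood instances
    use the substring and its qualified neighborhood instances to construct a PWM and an intra-motif distribution
    add the PWM to the set of PWMs and add the distribution to the set of intra-motif distributions
For each PWM in the set of PWMs do
    if its similarity (sim) to another PWM is ≥ 0.75 then
        cluster the two PWMs and delete the merged PWM from the set of PWMs
    if the FDR of the PWM > 0.2 then
        delete the PWM from the set of PWMs
use each remaining PWM and its corresponding intra-motif distribution to compute IC
add the motifs formed by the PWMs with the top 50 IC scores to the set of motifs
return the set of motifs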
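
The following minimal Python sketch illustrates three of the steps in the pseudocode above: selecting emerging substrings, constructing a PWM from aligned instances, and scoring a PWM by information content (IC). It is not the authors' implementation. The function names, the definition of support (the fraction of sequences containing a substring), the growth rate (the ratio of test support to control support), the pseudocount, the uniform background, and the threshold values in the usage lines are all assumptions made for illustration. The z-score filter, PWM clustering, FDR filter, and intra-motif distributions are not covered.

```python
import math
from collections import Counter


def kmer_support(seqs, k):
    """Fraction of sequences that contain each k-mer at least once."""
    counts = Counter()
    for s in seqs:
        # A set, so each k-mer is counted once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    n = len(seqs)
    return {kmer: c / n for kmer, c in counts.items()}


def emerging_substrings(test, control, k, min_support, min_growth):
    """k-mers that are frequent in the test set and enriched over control.

    Assumed criteria (hypothetical, for illustration only):
    support in test >= min_support, and
    support in test / support in control >= min_growth.
    """
    sup_test = kmer_support(test, k)
    sup_ctrl = kmer_support(control, k)
    selected = []
    for kmer, st in sup_test.items():
        sc = sup_ctrl.get(kmer, 0.0)
        growth = math.inf if sc == 0 else st / sc
        if st >= min_support and growth >= min_growth:
            selected.append(kmer)
    return sorted(selected)


def build_pwm(instances, pseudocount=1.0):
    """Column-wise base frequencies from equal-length aligned instances."""
    total = len(instances) + 4 * pseudocount
    pwm = []
    for j in range(len(instances[0])):
        col = Counter(inst[j] for inst in instances)
        pwm.append({b: (col[b] + pseudocount) / total for b in "ACGT"})
    return pwm


def information_content(pwm, background=0.25):
    """Total relative entropy (bits) against a uniform background."""
    return sum(
        p * math.log2(p / background)
        for col in pwm
        for p in col.values()
        if p > 0  # skip zero frequencies; 0 * log(0) is taken as 0
    )


# Toy usage with made-up sequences and thresholds.
test_set = ["ACGTACGT", "TTACGTAA", "GGACGTCC", "ACGTTTTT"]
control_set = ["TTTTTTTT", "GGGGCCCC", "AAAATTTT", "CCCCGGGG"]
hits = emerging_substrings(test_set, control_set, k=4,
                           min_support=0.75, min_growth=2.0)
print(hits)  # -> ['ACGT']
```

In this toy example only ACGT occurs in at least 75% of the test sequences, and it never occurs in the control set, so its growth rate is treated as infinite. Under these assumptions, ranking candidate PWMs by information_content and keeping the 50 highest would correspond to the final selection step.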