Research Article

A Fast Cluster Motif Finding Algorithm for ChIP-Seq Data Sets

Algorithm 1

Input: a test set and a control set.
Output: the set of motifs
// the set of motifs
// the set of emerging substrings
// the set of the qualified neighborhood instances
// the set of PWMs
// the set of intra-motif distributions
for   to   do
  for each -mer of substrings:   do
    if (, ) ≥  && (, , ) ≥   then
      add to
  for each -mer of :   do
    for each to   do
      calculate the z-score of each neighborhood instance
      if z() > 1.643 then
        add to
    use and to construct and
    add to set and add to set
for each of   do
  if sim(, ) ≥ 0.75 then
    cluster with and delete from
  if FDR() > 0.2 then
    delete from
use and corresponding to compute IC
add formed by of top 50 IC score to
return
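
Two of the steps in Algorithm 1, constructing a PWM from a group of instances and ranking candidates by IC, can be illustrated with a short Python sketch. This is not the authors' implementation. The function names `build_pwm`, `information_content`, and `top_by_ic`, the pseudocount, and the uniform background of 0.25 are all assumptions made for illustration, and the IC definition used in the paper may differ from the common relative-entropy form shown here.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: instances contain only these four bases


def build_pwm(instances, pseudocount=1.0):
    """Return per-column base frequencies for equal-length instances.

    Hypothetical helper: the pseudocount is an illustrative choice,
    not a value taken from the paper.
    """
    if not instances:
        raise ValueError("need at least one instance")
    width = len(instances[0])
    if any(len(s) != width for s in instances):
        raise ValueError("instances must share one length")
    total = len(instances) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(width):
        counts = Counter(s[pos] for s in instances)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm


def information_content(pwm, background=0.25):
    """Sum over columns of sum_b p_b * log2(p_b / background), in bits.

    Assumes a uniform background; zero-probability entries contribute 0.
    """
    return sum(
        p * math.log2(p / background)
        for column in pwm
        for p in column.values()
        if p > 0
    )


def top_by_ic(pwms, n=50):
    """Keep the n PWMs with the highest IC (n=50 mirrors the 'top 50' step)."""
    return sorted(pwms, key=information_content, reverse=True)[:n]
```

Under these assumptions, a fully conserved column contributes 2 bits and a uniform column contributes 0 bits, so `top_by_ic` favors the more conserved candidates.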