Research Article

Finding Top-k Covering Irreducible Contrast Sequence Rules for Disease Diagnosis

Algorithm 1

The MineTopIRs Algorithm.
Input: a gene expression dataset; the required number of rules k; the equivalent threshold; the support threshold; the confidence threshold
Output: all top-k covering irreducible contrast sequence rules for each sample with the given class label
Convert the dataset into the EWave model, w.r.t. the equivalent threshold;
Construct the Head-Tail matrix;
Initialize a list of rules, each with support and confidence values of 0, for each sample with the given class label;
Initialize the rule candidate set candi_R with all 1-size sequence rules;
Call breathfirst_search(candi_R, , , 1);
Return the rule list of every sample with the given class label;
Function: breathfirst_search(candi_R, , , current rule size)
  while candi_R ≠ ∅ do
    foreach rule of the next size, generated from the current-size rules in candi_R, do
      if its current-size subrule exists in candi_R then
        if supp(rule) > the support threshold then Pruning rule 1;
          if conf(rule) < the confidence threshold then
            Pruning rule 2;
            add the rule into candi_R;
          else
            Check the k-th covering rule of each sample to find the lowest confidence minconf and the corresponding support sup;
            if (conf(rule) > minconf) ∨ (conf(rule) = minconf ∧ supp(rule) ≥ sup) then
              Pruning rule 3;
              Update the rule list of each sample with the rule, based on Definitions 18 and 20;
      end
    end
    Delete all the current-size rules in candi_R;
    Increase the current rule size by 1;
  end
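
The level-wise control flow of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the EWave conversion, the Head-Tail matrix, and Definitions 18 and 20 are not reproduced. The sketch assumes, purely for illustration, that a sample is an ordered list of items with a class label, that a rule is a tuple of items matching a sample when it occurs as a subsequence, that support is the fraction of target-class samples matched, and that confidence is the fraction of matched samples belonging to the target class. The names is_subsequence, supp_conf, and mine_top_k are hypothetical.

```python
def is_subsequence(rule, seq):
    """True if the items of `rule` occur in `seq` in the same order."""
    it = iter(seq)
    return all(item in it for item in rule)


def supp_conf(rule, samples, target):
    """Toy support and confidence (assumed definitions, not the paper's)."""
    labels = [label for seq, label in samples if is_subsequence(rule, seq)]
    hits = labels.count(target)
    n_target = sum(1 for _, label in samples if label == target)
    supp = hits / n_target if n_target else 0.0
    conf = hits / len(labels) if labels else 0.0
    return supp, conf


def mine_top_k(samples, target, k, min_supp, min_conf):
    """Breadth-first search that keeps k best covering rules per sample."""
    items = sorted({x for seq, _ in samples for x in seq})
    # One list of k dummy entries (conf, supp, rule) per target-class sample.
    top = {i: [(0.0, 0.0, None)] * k
           for i, (_, label) in enumerate(samples) if label == target}
    if not top:
        return top
    candidates = {(x,) for x in items}  # all 1-size rules
    while candidates:
        next_level = set()
        for base in candidates:
            for x in items:
                if x in base:  # assumption: no repeated items in a rule
                    continue
                rule = base + (x,)
                if rule[1:] not in candidates:  # subrule check
                    continue
                supp, conf = supp_conf(rule, samples, target)
                if supp <= min_supp:  # fails the support test
                    continue
                if conf < min_conf:
                    next_level.add(rule)  # keep for further extension
                    continue
                # Lowest k-th entry over all samples acts as a gate.
                minconf, sup = min(e[-1][:2] for e in top.values())
                if conf > minconf or (conf == minconf and supp >= sup):
                    for i in list(top):
                        if is_subsequence(rule, samples[i][0]):
                            merged = top[i] + [(conf, supp, rule)]
                            merged.sort(key=lambda e: e[:2], reverse=True)
                            top[i] = merged[:k]
        candidates = next_level  # drop the current-size rules
    return top


samples = [
    (["a", "b", "c"], "pos"),
    (["a", "c", "b"], "pos"),
    (["b", "a", "c"], "neg"),
    (["c", "b", "a"], "neg"),
]
result = mine_top_k(samples, "pos", k=1, min_supp=0.4, min_conf=0.8)
print(result[0][0][2])  # → ('a', 'b')
```

As in the pseudocode, only rules whose confidence is still below the threshold are kept as candidates for extension, and 1-size rules serve only as seeds. Ranking by confidence first and support second mirrors the comparison against minconf and sup in the algorithm.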