Computational and Mathematical Methods in Medicine

Research Article

Finding Top-k Covering Irreducible Contrast Sequence Rules for Disease Diagnosis

The MineTopIRs Algorithm.

Input: a gene expression dataset ; the required number of rules ; the equivalent threshold ; the support threshold ;
the confidence threshold
Output: All top-k covering irreducible contrast sequence rules for each sample with class label
Convert dataset into the EWave model , w.r.t. ;
Construct the Head-Tail matrix;
Initialize a list of rules with both support and confidence values of 0, for each sample with class label ;
Initialize the rule candidate set candi_R with all 1-size sequence rules;
Call breathfirst_search(candi_R, , , 1);
Return for every with class label ;

Function: breathfirst_search(candi_R, , , )
while candi_R is not empty do
    foreach -size rule generated based on the -size rules in candi_R do
        if ∀ -size subrule of exists in candi_R then
            if supp() > then Pruning rule 1;
            if conf() < then
                Pruning rule 2;
                add into candi_R;
            else
                Check the k-th covering rule for each sample to find the lowest confidence minconf and the corresponding support sup;
                if (conf() > minconf) ∨ (conf() = minconf ∧ supp() ≥ sup) then
                    Pruning rule 3;
                    Update for each sample with based on Definitions 18 and 20;
            end
        end
    Delete all the -size rules in candi_R;
    ++;
end
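
The comparison against the k-th covering rule in the else branch can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names Rule, init_topk, and try_update are hypothetical, rules are reduced to a label with a confidence and a support, and the update step is a plain sort-and-truncate that only stands in for Definitions 18 and 20, which are not reproduced in this listing. The sketch also assumes that "the corresponding support" means the support of the rule with the lowest confidence, with ties broken toward the lower support.

```python
from typing import Dict, List, NamedTuple, Sequence


class Rule(NamedTuple):
    name: str    # hypothetical label, for illustration only
    conf: float  # confidence of the rule
    supp: int    # support of the rule


def init_topk(samples: Sequence[str], k: int) -> Dict[str, List[Rule]]:
    """Give every sample k placeholder rules with support and confidence 0."""
    return {s: [Rule("", 0.0, 0)] * k for s in samples}


def try_update(topk: Dict[str, List[Rule]], rule: Rule,
               covered: Sequence[str], k: int) -> bool:
    """Apply the confidence-then-support test and update the covered samples.

    Each list in `topk` is kept sorted best-first, so index k - 1 holds
    the k-th covering rule of that sample.
    """
    if not covered:
        return False
    # Lowest confidence among the k-th rules, with its support
    # (assumption: ties on confidence are broken toward the lower support).
    minconf, sup = min((topk[s][k - 1].conf, topk[s][k - 1].supp)
                       for s in covered)
    passes = rule.conf > minconf or (rule.conf == minconf and rule.supp >= sup)
    if not passes:
        return False
    for s in covered:
        # The new rule is listed first so that the stable sort lets it
        # win ties, matching the ">=" in the support comparison.
        ranked = sorted([rule] + topk[s], key=lambda r: (-r.conf, -r.supp))
        topk[s] = ranked[:k]
    return True


topk = init_topk(["s1", "s2"], k=2)
r1, r2 = Rule("r1", 0.8, 3), Rule("r2", 0.9, 2)
try_update(topk, r1, ["s1", "s2"], k=2)
try_update(topk, r2, ["s1"], k=2)
print([r.name for r in topk["s1"]])
```

In this toy run, sample s1 ends up holding r2 followed by r1, since r2 has the higher confidence. A later rule with confidence 0.8 and support 2 that covers only s1 would be rejected, because it ties r1 on confidence but has lower support.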