Research Article
Identifying and Assessing Interesting Subgroups in a Heterogeneous Population
while the stopping criterion is not met do
    Shuffle the genes.
    Partition the genes into disjoint subsets.
    for each subset of the genes do
        Perform hierarchical clustering on the genes in the subset.
        for each candidate number of clusters do
            Cut the dendrogram from the hierarchical clustering to yield that number of gene clusters.
            for each gene cluster do
                Take the cluster (a subtype identifier).
                Run hierarchical clustering on patients using only the genes in the cluster (assume that this step yields a set of patient clusters).
                for each patient cluster do
                    Perform two-sample t-tests (e.g., relapse yes vs. no) on all genes, using the individuals in the cluster.
                    Fit the null distribution of the two-sample t-statistics with known functional forms.
                    if the normal approximation for the null distribution is acceptable then
                        Compute the FDR estimate based on the normal approximation.
                    end if
                    if the normal approximation for the null distribution is not acceptable then
                        Compute the FDR estimate based on permutations.
                    end if
                    Compute our proposed FDR-based measure to assess the resulting cluster.
                    For a given subtype, compute the p-value of the measure by permuting group labels.
                end for
            end for
        end for
    end for
end while
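Two steps of the procedure, the random partition of the genes and the permutation of group labels to obtain a p-value, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the round-robin split, the Welch (unequal-variance) form of the two-sample t-statistic, and the use of |t| on a single gene as the permuted statistic are all assumptions made for illustration. The hierarchical clustering steps and the proposed FDR-based measure are not reproduced here.

```python
import math
import random
import statistics


def partition_genes(genes, n_subsets, rng):
    """Shuffle the genes and split them into disjoint, near-equal subsets."""
    shuffled = list(genes)
    rng.shuffle(shuffled)
    # Round-robin slicing guarantees the subsets are disjoint and cover all genes.
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def welch_t(x, y):
    """Two-sample t-statistic; the Welch form is an assumption of this sketch."""
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    std_err = math.sqrt(var_x / len(x) + var_y / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / std_err


def abs_t(values, labels):
    """Hypothetical stand-in statistic: |t| between the label-1 and label-0 groups."""
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(welch_t(group1, group0))


def permutation_p_value(values, labels, stat, n_perm, rng):
    """P-value of stat(values, labels) obtained by permuting the group labels."""
    observed = stat(values, labels)
    permuted = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)  # group sizes are preserved, only membership changes
        if stat(values, permuted) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


rng = random.Random(0)
subsets = partition_genes(range(10), 3, rng)

# Toy expression values for one gene: five "no relapse" and five "relapse" patients.
values = [0.1, 0.2, 0.3, 0.4, 0.5, 5.1, 5.2, 5.3, 5.4, 5.5]
labels = [0] * 5 + [1] * 5
p_value = permutation_p_value(values, labels, abs_t, 200, rng)
```

In the procedure itself the permuted quantity would be the FDR-based measure computed for a patient cluster rather than a single-gene |t|; any function with the same `(values, labels)` signature could be passed as `stat` in this sketch.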