Research Article

Identifying and Assessing Interesting Subgroups in a Heterogeneous Population

Algorithm 1: CAMS.
while       do
 Shuffle the genes
 Partition the genes into disjoint subsets
 for each subset of the genes do
  Perform hierarchical clustering on the genes in the current subset
  for each candidate number of clusters do
   Cut the dendrogram from the hierarchical clustering to yield that number of gene clusters.
   for each gene cluster do
    Take the current gene cluster (a subtype identifier)
    Run hierarchical clustering on patients using only the genes
    in this cluster (assume that this step yields several patient clusters).
    for each patient cluster do
     Perform two-sample t-tests (e.g., relapse yes vs. no)
     using the individuals in the current patient cluster and all genes.
     Fit the null distribution of the two-sample t-statistics
     with known functional forms.
     if (normal approximation for the null distribution is acceptable) then
      Compute the FDR estimate based on the normal approximation.
     end if
     if (normal approximation for the null distribution is not acceptable) then
      Compute the FDR estimate based on permutations.
     end if
     Compute our proposed FDR-based measure
     to assess the resulting cluster.
     For a given subtype, compute the p-value of this measure by permuting group labels.
    end for
   end for
  end for
 end for
end while
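
The loop structure above can be illustrated with a minimal Python sketch, assuming NumPy and SciPy are available and that the expression data form a patients-by-genes matrix with a binary outcome vector. The function names (`cut_tree`, `candidate_subgroups`, `t_statistics`, `permutation_p_value`), the average-linkage choice, and the decision to permute the outcome labels within a patient cluster are assumptions made for illustration; the algorithm does not specify them. The FDR estimation and the proposed FDR-based measure are not reproduced here: `measure` is a hypothetical placeholder that maps t-statistics to a scalar, with larger values taken to be more interesting.

```python
import numpy as np
from scipy import stats
from scipy.cluster.hierarchy import fcluster, linkage


def cut_tree(rows, n_clusters):
    """Hierarchically cluster `rows`; cut the dendrogram into at most n_clusters groups."""
    tree = linkage(rows, method="average")  # linkage method is an assumption
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def candidate_subgroups(expr, n_subsets, n_gene_clusters, n_patient_clusters, rng):
    """One pass of the outer loop over a (patients x genes) matrix `expr`.

    Yields (gene_indices, patient_indices) for every patient cluster found.
    """
    shuffled = rng.permutation(expr.shape[1])            # shuffle the genes
    for subset in np.array_split(shuffled, n_subsets):   # disjoint gene subsets
        gene_labels = cut_tree(expr[:, subset].T, n_gene_clusters)
        for g in np.unique(gene_labels):
            genes = subset[gene_labels == g]             # a subtype identifier
            patient_labels = cut_tree(expr[:, genes], n_patient_clusters)
            for p in np.unique(patient_labels):
                yield genes, np.flatnonzero(patient_labels == p)


def t_statistics(expr, outcome, patients):
    """Per-gene two-sample t-statistics (outcome 1 vs. 0) within one patient cluster."""
    x, y = expr[patients], outcome[patients]
    return stats.ttest_ind(x[y == 1], x[y == 0], axis=0).statistic


def permutation_p_value(measure, expr, outcome, patients, n_perm, rng):
    """Permutation p-value of `measure`, a hypothetical stand-in for the FDR-based measure."""
    observed = measure(t_statistics(expr, outcome, patients))
    permuted = outcome.copy()
    exceed = 0
    for _ in range(n_perm):
        # Assumption: the permuted "group labels" are the outcome labels in this cluster.
        permuted[patients] = rng.permutation(outcome[patients])
        exceed += measure(t_statistics(expr, permuted, patients)) >= observed
    return (1 + exceed) / (1 + n_perm)
```

In this sketch each gene subset needs at least two genes, and each outcome group within a patient cluster needs at least two individuals, otherwise the clustering or the t-test is undefined; a fuller implementation would skip such clusters.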