Abstract

With 35,000 genes and hundreds of thousands of protein states to identify, correlate, and understand, it no longer suffices to study one gene, gene product, or process at a time. We have entered the “omic” era in biology. But large-scale omic studies of cellular molecules in aggregate can rarely answer interesting questions without information from traditional hypothesis-driven research. The two types of science are synergistic. A case in point is the set of pharmacogenomic studies that we and our collaborators have done with the 60 human cancer cell lines of the National Cancer Institute’s drug discovery program. Those cells (the NCI-60) have been characterized pharmacologically with respect to their sensitivity to >70,000 chemical compounds. We are further characterizing them at the DNA, RNA, protein, and functional levels. Our major aim is to identify pharmacogenomic markers that can aid in drug discovery and design, as well as in individualization of cancer therapy. The bioinformatic and chemoinformatic challenges of this study have demanded novel methods for analysis and visualization of high-dimensional data. These include the color-coded “clustered image map” and the MedMiner program package, which captures and organizes the biomedical literature on gene-gene and gene-drug relationships. Microarray transcript expression studies of the 60 cell lines reveal, for example, a gene-drug correlation with potential clinical implications: that between the asparagine synthetase gene and the enzyme-drug L-asparaginase in ovarian cancer cells.