Abstract

Both genetic and epigenetic alterations characterize human nonsmall cell lung cancer (NSCLC), but the biological processes that create or select these alterations remain incompletely investigated. Our hypothesis posits that a roughly reciprocal relationship between the propensity for promoter hypermethylation and a propensity for genetic deletion leads to distinct molecular phenotypes of lung cancer. To test this hypothesis, we examined promoter hypermethylation of 17 tumor suppressor genes, as a marker of epigenetic alteration propensity, and deletion events at the 3p21 region, as a marker of genetic alteration. To model the complex biology between these somatic alterations, we utilized an item response theory model. We demonstrated that tumors exhibiting LOH at greater than 30% of informative alleles in the 3p21 region have a significantly reduced propensity for hypermethylation. At the same time, tumors with activating KRAS mutations showed a significantly increased propensity for hypermethylation of the loci examined, a result similar to what has been observed in colon cancer. These data suggest that NSCLCs have distinct epigenetic or genetic alteration phenotypes acting upon tumor suppressor genes and that mutation of oncogenic growth promoting genes, such as KRAS, is associated with the epigenetic phenotype.

1. Introduction

Lung cancer remains one of the most commonly diagnosed cancers and the leading cause of cancer death in both men and women in the United States. In 2006, there were over 162 000 deaths attributable to lung cancer in the U.S. [1]. The major cause of lung cancer is tobacco smoking, although environmental tobacco smoke, asbestos, and other environmental and industrial exposures also contribute to lung carcinogenesis [2]. Nonsmall cell lung cancer (NSCLC) is derived from the epithelial cells of the lung and bronchus and is characterized by a wide variety of molecular alterations. Included among these are activating mutations in oncogenes (such as cMYC, KRAS, EGFR, CCND1, and BCL2) and inactivating lesions in tumor suppressor genes [3, 4]. The inactivation of tumor suppressor genes occurs chiefly through allele loss (both single allele loss and homozygous gene deletion) or epigenetic silencing associated with hypermethylation of CpG islands in the promoters of many of these genes [5–7].

The relationship between these various genetic and epigenetic alterations has remained relatively unexplored, although it is now known that there is a surprisingly large number of genetic alterations evident in solid tumors [8]. A number of groups have reported that alterations to TP53, measured either as mutation of the gene or through altered immunohistochemical staining, are associated with greater prevalence of LOH at various loci [9–11]. There have also been links made between lung carcinogen exposures and several genetic alterations. For example, LOH of the FHIT gene is found more often in smokers and those exposed to asbestos in an occupational setting [12, 13]. Work from our group and others has also suggested that carcinogen exposures may drive the type of alterations observed in specific genes, such as CDKN2A (encoding P16INK4A), which is more often deleted in never smoking individuals [14] but has an increasing prevalence of hypermethylation with increasing duration of tobacco smoking [15, 16]. Likewise, mutation of the EGFR gene occurs specifically in adenocarcinomas, and most often in women and never smoking patients [17–20]. This type of evidence suggests that distinct molecular phenotypes exist in NSCLC and that exposure, lifestyle, or a combination of these factors drives these phenotypes. Better definition of the number and character of these phenotypes may be critical for making clinical decisions about treatment course for patients, as has been evidenced in the EGFR mutation case [21–24]. Thus, this study was aimed at better defining the molecular phenotypes in NSCLC by closely examining the relationship between genetic and epigenetic alterations in this disease.

2. Subjects and Methods

2.1. Study Population

Eligible cases consisted of all newly diagnosed patients with resectable lung cancer who received treatment at the Massachusetts General Hospital Thoracic Surgery, Oncology, and Pulmonary Services from November 1992 through December 1996 [25]. The patients involved in these studies provided written informed consent under a protocol approved by the appropriate Institutional Review Boards. Patients with recurrent disease or nonoperable tumors were excluded. A random subset of 260 patients was analyzed for somatic loss (LOH) at 3p21. A subset of 185 patients of the parent study had fresh lung tumor tissue obtained for use in hypermethylation analysis. Tumor tissue was snap frozen in liquid nitrogen and stored until processed. Demographic and epidemiological data, including all of the data on tobacco use, were gathered from a self-administered questionnaire completed by patients and reviewed by a single interviewer during the hospitalization for thoracic surgery.

2.2. 3p21 LOH Analysis

Allelic loss at the 3p21 region was analyzed at microsatellite markers D3S1029, D3S3582, D3S3667, D3S3640, D3S1568, D3S3026, and D3S1478. The details of this analysis, of the associated statistical analysis, and of the construction of the fraction allele loss (FAL) score have been previously described for this population [9, 26].
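As a rough illustration of how a fraction allele loss score can be computed, the sketch below takes per-marker LOH calls and returns the fraction of informative markers showing loss, then dichotomizes at 0.3. The function names, the use of `None` for non-informative markers, and the example calls are assumptions made for illustration; the exact construction used in this population is the one described in [9, 26].

```python
from typing import Optional, Sequence


def fraction_allele_loss(loh_calls: Sequence[Optional[bool]]) -> Optional[float]:
    """Fraction of informative markers showing LOH.

    Each entry is True (LOH), False (allele retained), or None
    (non-informative marker, assumed here to be excluded from the score).
    """
    informative = [call for call in loh_calls if call is not None]
    if not informative:
        return None  # score is undefined without informative markers
    return sum(informative) / len(informative)


def fal_group(fal: float, cutoff: float = 0.3) -> str:
    """Dichotomize the score: 'high' means more than `cutoff` of loci lost."""
    return "high" if fal > cutoff else "low"


# Hypothetical calls at seven 3p21 markers (not data from this study).
calls = [True, False, None, True, False, False, None]
score = fraction_allele_loss(calls)  # 2 of 5 informative markers show LOH
group = fal_group(score)
```

Excluding non-informative markers from the denominator keeps the score comparable between tumors with different numbers of heterozygous markers.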

2.3. Hypermethylation Analysis

DNA was extracted from fresh frozen tumor tissue using the Gentra Puregene DNA extraction kit (Gentra, Minneapolis, Minn, USA) following the manufacturer’s protocol. Sodium bisulfite modification of the resultant DNA was performed as previously described [27]. Briefly, DNA was denatured in NaOH and treated with sodium bisulfite (Sigma Chemical Co., St. Louis, Mo, USA) for 16 hours. The DNA was then purified using a Wizard DNA Clean-Up Kit (Promega, Madison, Wis, USA), treated again with NaOH, and ethanol precipitated. DNA was rehydrated in water for subsequent use in PCR amplification.

We have specifically chosen to utilize traditional methylation specific PCR (MSP) [27] for the analysis of promoter hypermethylation in these studies. We have previously examined potential biases in the sensitivity of using this assay against the relative-quantitative Taqman-based methods [28], and have seen no evidence for potential bias based on tumor quantity or tumor stage in the samples analyzed.

Sodium bisulfite modified DNA was used as the template for methylation specific PCR (MSP) as previously described in [27] using primers specific for the methylated promoters of CDKN2A [27], RASSF1A [29], APC [30], PYCARD [31], LAMC2 [32], SFRP1, SFRP2, SFRP4, SFRP5 [33], MGMT, DAPK, RARB, CDH1 [34], CDH13 [35], MLH1 [36], CCND2 [37], and PRSS3 [38]. All methylation specific PCRs are optimized to detect greater than 5% methylated substrate in each sample. To control for the presence of modified DNA, primers specific to a modified region of the ACTB gene containing no CpG sites were used [39]. Modified circulating blood lymphocyte DNA (obtained from a control subject) and the same lymphocyte DNA completely methylated using SssI DNA methylase and modified by treatment with sodium bisulfite were used as the negative and positive controls, respectively, in each run.

These genes were chosen because promoter hypermethylation detected using this method has previously been shown to correlate with transcriptional silencing of these genes, and because their hypermethylation occurs in a tumor-specific pattern. We also wished to examine the silencing of tumor suppressor genes involved in a variety of cellular processes and pathways, thereby not limiting the analysis to genes involved in a single pathway targeted for inactivation. The genes selected are known to be involved in processes including cell cycle control (CDKN2A, RASSF1A, APC, and CCND2), apoptosis (DAPK, PYCARD), extracellular interactions (LAMC2, PRSS3), transcriptional regulation (RARB), WNT signaling (SFRP family), cell-cell signaling (CDH1, CDH13), and DNA repair (MGMT, MLH1).

2.4. Statistical Analysis

We have previously demonstrated that there are no discrete groupings of tumors by the number of genes undergoing hypermethylation [5], and we have observed that there is a great deal of correlation between methylation of the individual loci, such that simply counting the number of methylated loci in an individual tumor is statistically inappropriate. Therefore, we employed an item response theory (IRT) model [40], which has been shown to be the most appropriate latent trait technique [41] for examining this type of discrete data [42] and which allows the propensity for methylation to be treated as a continuous variable in a regression framework.
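For orientation, one common way to write an item response model with a gene-specific intercept and slope and a regression on the latent trait is given below. The logistic link and all of the notation are assumptions made for illustration, not a statement of the exact specification fitted here: $Y_{ij}$ is the methylation status of gene $j$ in tumor $i$, $\theta_i$ is the latent propensity for methylation, $\alpha_j$ and $\beta_j$ are the intercept and slope for gene $j$, and $\mathbf{x}_i$ holds the covariates for tumor $i$.

```latex
\Pr(Y_{ij} = 1 \mid \theta_i)
  = \frac{\exp(\alpha_j + \beta_j \theta_i)}{1 + \exp(\alpha_j + \beta_j \theta_i)},
\qquad
\theta_i \sim N\!\left(\mathbf{x}_i^{\top} \boldsymbol{\gamma},\; 1\right)
```

Under this form, a larger $\beta_j$ means that methylation of gene $j$ tracks the latent trait more closely, and each element of $\boldsymbol{\gamma}$ is a shift in the latent trait mean expressed in standard deviation units.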

To construct our IRT model, we used stepwise selection combined with domain knowledge, examining the effect of exposures on the latent methylation variable. Age and gender were included in the model, as there have been reports of age-related methylation [43, 44], and there is a well-established difference in the prevalence of bladder cancer by gender. Histology (adenocarcinoma, squamous cell carcinoma, and others) was initially included in the model. We examined the effects of exposures that have been demonstrated to be associated with lung cancer incidence, including cigarette smoking and occupational asbestos exposure. Cigarette smoking was examined as a variable comparing never, former, and current smokers, as well as using measures of duration and intensity. We also examined the effect of mutation of the KRAS gene, which has been previously reported in this population of tumors [25, 45], as mutation of this pathway is associated with a methylator phenotype in colon cancer [46]. As a marker of genetic alteration, we included the FAL score of 3p21 LOH. Variables were excluded from the model by using the AIC to determine the most parsimonious model. The final model included gender, KRAS mutation, and 3p21 FAL (low, ≤0.3, versus high, >0.3) as covariates, in addition to the promoter hypermethylation status of the 17 genes examined.
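The AIC comparison used to arrive at a parsimonious model can be sketched as follows. The candidate covariate sets, log-likelihood values, and parameter counts are hypothetical placeholders chosen only to show the mechanics; they are not results from this study.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits: (maximized log-likelihood, number of covariate parameters).
# Item intercepts and slopes are common to every candidate, so they add the
# same constant to each AIC and are left out of the count here.
candidates = {
    "gender + KRAS + FAL": (-1510.2, 3),
    "gender + KRAS + FAL + age": (-1509.9, 4),
    "gender + KRAS + FAL + age + histology": (-1508.8, 6),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # covariate set with the lowest AIC
```

In this made-up example the extra covariates improve the log-likelihood by less than the penalty of 2 per added parameter, so the smallest model is retained.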

We also examined the associations between propensity for hypermethylation, again characterized by the methylation latent trait, and patient survival. In a second stage analysis, we used the predictions of the methylation latent trait and employed a Cox proportional hazards model to examine the logarithm of the hazard of death as a linear function of covariates, including the methylation latent trait. All analyses were conducted in R version 2.4.1 [47], including custom software for the IRT model, available upon request.
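The second stage model described above can be written in the standard Cox proportional hazards form shown below. The notation is introduced here for illustration: $\hat{\theta}_i$ is the predicted methylation latent trait for patient $i$, $\mathbf{z}_i$ holds any other covariates, and $h_0(t)$ is the unspecified baseline hazard.

```latex
\log h(t \mid \hat{\theta}_i, \mathbf{z}_i)
  = \log h_0(t) + \delta\, \hat{\theta}_i + \mathbf{z}_i^{\top} \boldsymbol{\eta}
```

Here $\exp(\delta)$ is the hazard ratio for death per one-unit increase in the latent trait, so a finding of no association corresponds to $\delta$ not differing significantly from zero.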

3. Results

The demographics of the study population are shown in Table 1. As expected, the majority of patients were either current or former smokers, and the mean age was approximately 67 years. Greater than 50% of the tumors were adenocarcinomas, about 34% were squamous cell carcinomas, and the remainder were rarer histologies such as large cell. As previously noted, approximately 17% of the cases had mutation of KRAS at codon 12 [45], the hotspot for lung cancer, and about 34% had loss of heterozygosity (LOH) at greater than 30% of informative alleles in the 3p21 region [9].

Figure 1(a) presents the prevalence of hypermethylation of the 17 gene promoters examined in this study. The prevalence of this alteration is highly variable by gene: some genes, such as SFRP1 and SFRP2, were hypermethylated in nearly 80% of cases, while others, such as MLH1 or APC, rarely exhibited hypermethylation in this series. There was moderate correlation between methylation at the individual loci, as depicted in Figure 1(b), where lighter shades represent correlation coefficients approaching 1. Because we observed this correlation between methylation events, together with the broad range of prevalence of these alterations at different genes, we did not treat these events as independent in the analysis; instead, we employed an item response theory model to investigate the propensity for hypermethylation in these tumors. This approach takes the correlation into account and has been shown to be appropriate for similar data examined in bladder cancer [42].
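A pairwise correlation matrix of the kind displayed in Figure 1(b) can be computed from binary methylation calls as sketched below. The text does not state which correlation coefficient was used; the sketch assumes the Pearson correlation of 0/1 indicators (the phi coefficient), and the gene labels and calls are hypothetical.

```python
from math import sqrt
from typing import Dict, List


def phi(x: List[int], y: List[int]) -> float:
    """Pearson correlation of two 0/1 vectors (the phi coefficient)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined if a locus never varies
    return cov / sqrt(var_x * var_y)


# Hypothetical methylation calls (1 = methylated), one entry per tumor.
calls: Dict[str, List[int]] = {
    "GENE_A": [1, 1, 0, 0, 1, 0],
    "GENE_B": [1, 1, 0, 0, 0, 0],
    "GENE_C": [0, 0, 1, 1, 0, 1],
}

# Full symmetric matrix keyed by gene pair.
matrix = {(g, h): phi(calls[g], calls[h]) for g in calls for h in calls}
```

Correlated loci violate the independence that a simple count of methylated genes implicitly assumes, which is the motivation given in the text for the latent trait model.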

The results of the item response model are listed in Table 2, which provides the coefficient values of the slopes and intercepts for each of the genes examined in this series. Although the values of the intercepts are not directly interpretable, they are provided for completeness as they are used in plotting the relationship between the latent trait and the probability of methylation. The slope, on the other hand, indicates the strength of the relationship between the latent trait and the probability of methylation of the gene. The range of values for the slope terms suggests that genes contribute differentially to the modeled underlying propensity for methylation. Comparisons with the overall prevalence demonstrate that the model is not solely driven by prevalence: a gene with relatively high prevalence, such as RASSF1A, which is methylated in approximately 50% of cases, in fact has a nonsignificant and negative item response slope, while CCND2 (encoding Cyclin-D2), with a prevalence of methylation of approximately 30%, has a highly significant slope of 1.3.
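To show how an intercept and slope translate into a curve relating the latent trait to the probability of methylation, the sketch below evaluates an item response function under an assumed logistic link. The slope of 1.3 is the CCND2 value quoted above; the intercept of -1.0 and the flat-item parameters are hypothetical placeholders, not values from Table 2.

```python
from math import exp
from typing import List


def methylation_probability(theta: float, intercept: float, slope: float) -> float:
    """Probability of methylation at latent trait value `theta`,
    assuming a logistic item response function."""
    return 1.0 / (1.0 + exp(-(intercept + slope * theta)))


def response_curve(intercept: float, slope: float, thetas: List[float]) -> List[float]:
    """Evaluate the item response function over a grid of latent trait values."""
    return [methylation_probability(t, intercept, slope) for t in thetas]


grid = [-2.0, -1.0, 0.0, 1.0, 2.0]  # latent trait in standard deviation units

# Steep item: slope from the text, hypothetical intercept.
steep = response_curve(intercept=-1.0, slope=1.3, thetas=grid)

# Nearly flat item: hypothetical parameters for a locus whose methylation
# is common but only weakly related to the latent trait.
flat = response_curve(intercept=0.0, slope=-0.1, thetas=grid)
```

A steep curve means that the gene discriminates well between tumors with low and high propensity, whereas a flat curve means that the gene is methylated at a similar rate regardless of the latent trait.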

This model also examined the impact of exposures, demographics, tumor characteristics, and other molecular alterations on the underlying propensity for methylation. We found that only KRAS mutation status and the fraction allele loss score of LOH at 3p21 were significantly associated with the propensity for hypermethylation. Gender was kept in the model as it is a known confounder of KRAS mutation status [25]. KRAS mutation was associated with an approximately 0.5-unit increase in the latent trait mean (where the latent trait is scaled to have unit standard deviation). On the other hand, a 3p21 fraction allele loss score >0.3, indicating that greater than 30% of informative loci examined in the region demonstrated LOH, was associated with a statistically significant reduction of 0.5 unit in the methylation latent trait mean. We found no significant association between overall patient survival and the methylation latent trait in a second stage analysis.
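To give a sense of scale for a 0.5-unit shift in the latent trait, the worked example below converts it into an odds ratio for a single gene. It assumes a logistic item response function, which is an assumption made for illustration, and uses the CCND2 slope of 1.3 reported above; $\beta_j$ denotes the slope for gene $j$ and $\Delta\theta$ the shift in the latent trait.

```latex
\text{OR}_j = \exp(\beta_j \,\Delta\theta), \qquad
\exp(1.3 \times 0.5) = \exp(0.65) \approx 1.92, \qquad
\exp(1.3 \times (-0.5)) = \exp(-0.65) \approx 0.52
```

Under these assumptions, the shift associated with KRAS mutation would correspond to roughly a doubling of the odds of CCND2 methylation, and the shift associated with high 3p21 allele loss to roughly a halving; genes with smaller slopes would change correspondingly less.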

4. Discussion

The existence of a CpG island methylator phenotype (CIMP) has been demonstrated in a number of tumor types, most conclusively in colorectal and gastric cancers [44, 48–50]. In colon cancer, the CIMP phenotype is associated with genetic mutation of the BRAF gene, and it has also been suggested that this phenotype, particularly through its associated methylation of the MLH1 gene, is responsible for mismatch repair deficiency and thus the microsatellite instability observed in a subset of colorectal cancers [46]. Mutation of the KRAS gene has also been associated with CIMP in colorectal cancer [48, 50]. Our recent work has also suggested that tumors from a number of other sites, including the lung, may exhibit differences in their underlying propensity for hypermethylation [5]. Assessment of the methylator phenotype in lung cancer is complicated by a lack of understanding as to which genes should be assessed to determine this phenotype, as those used in colon cancers may not be appropriate. Thus, we have sought to examine more thoroughly what factors may be driving this propensity and whether there are specific molecular phenotypes extant in NSCLC.

As we previously demonstrated, there are no discrete groupings of tumors by the number of genes undergoing hypermethylation [5], a finding that has led some to question the existence of a methylator phenotype [51]. We have also shown (Figure 1(b)) that there is a great deal of correlation between methylation of the individual loci, such that simply counting the number of methylated loci in an individual tumor is statistically inappropriate. Therefore, in order to employ appropriate statistical methodologies and to avoid biased separation of tumors into classes based on arbitrary counts, we employed an item response theory model, which has been shown to be the most appropriate latent trait technique for examining this type of discrete data [42] and which allows the propensity for methylation to be treated as a continuous variable in a regression framework.

Using this approach, we observe differences in the contribution to the underlying methylation latent trait by the different loci examined, an observation that would be lost if the methylation events were simply counted and thus considered equal. This suggests that some genes, such as SFRP2, SFRP5, and CCND2, may be more informative (due to their significant and greater item response slopes) than genes with nonsignificant or small item response slopes, such as RASSF1A or CDKN2A. This is of interest as CDKN2A (encoding P16INK4A) is often considered as a marker of CIMP in colorectal cancer [49], but in NSCLC, silencing of this gene appears less informative for predicting the overall propensity for methylation. CDKN2A has also been shown to be one of the earliest genes identified to become hypermethylated during lung carcinogenesis, with detection possible in epithelial cells from smokers prior to lung cancer diagnosis [52]. Thus, alteration of this gene may be more important in the early stages of carcinogenesis and may be common in tumors irrespective of their overall propensity for methylation.

We observed that tumors exhibiting more extensive LOH at the 3p21 region showed a lower propensity for methylation. This is particularly interesting, as it is often thought that LOH and methylation may occur together, since 2 hits are needed to inactivate tumor suppressors. Further, the tumor suppressor RASSF1A, which is located at 3p21, was included in the panel of genes in our model, suggesting that the deletion phenotype and methylation phenotype are perhaps even less correlated than we report here. Our results from this examination, together with our previous examination of LOH at 3p21 [9] and our work showing p16 deletion and methylation to be reciprocal [14], suggest that allele loss in the 3p21 region, beyond identifying the site of an NSCLC-specific tumor suppressor gene, is a more general marker of a propensity for genetic inactivation of tumor suppressor genes via allelic imbalance or loss. Our data indicate that tumors predisposed to this type of genetic alteration are less likely to undergo epigenetic alterations and vice versa. That is, our result suggests that tumor suppressor gene silencing in NSCLCs arises either through a preponderance of allele loss events or through epigenetic silencing events, occurring in a roughly dichotomous fashion.

Our model also demonstrates that KRAS mutation is associated with a greater propensity for methylation, a result similar to that reported in colon cancer [48, 50]. Figure 2 presents a model describing this overall relationship. The mechanism that underlies the connection between KRAS mutation and increased methylation propensity remains unclear; indeed, the same can be said for the established connection between BRAF mutation and CIMP, although it is known that these two oncogenes operate in the same cell signaling pathway [48, 53]. Our study as well as previous studies cannot absolutely discern which event is occurring first, mutation of these genes or establishment of greater promoter methylation, but one can speculate that the oncogenic activation of KRAS may drive cell division, thus increasing the possibility for alteration of promoter methylation profiles. At the same time, silencing of genes involved in DNA damage or repair through promoter hypermethylation may allow for the propagation of mutation of these oncogenes. Approaches employing model systems with oncogenic mutation of these growth promoting genes may help to shed light on the timing of these events and help to answer this question.

A limitation of this study was the use of a panel of selected loci and a limited regional examination of deletion. Thus, our results may be indicative of gene-specific selection events and not of the broader phenotypes that we have proposed. A more conclusive examination will require that genomic-level approaches be applied to characterize the epigenetic and genetic alterations within the same tumor, and new technologies are becoming available that allow this type of examination on clinical specimens. As these technologies develop, it will be critical to employ appropriate and rigorous statistical methods, such as those used here, to analyze the data and to allow for an understanding of the biology driving these alterations. Our results, though, do provide an impetus to test these hypotheses using these more genome-wide approaches.

More broadly, these results suggest the existence of distinct molecular pathways of tumor suppressor gene inactivation in sporadic tumors. Classifying tumors based on these pathways may lead to improved understanding of the etiology of these diseases, as it may improve the overall classification of disease for genetic association studies. For example, one may posit that individuals with polymorphisms in, perhaps, genes involved in DNA repair (particularly recombinational repair) that lead to reduced repair capacity, may be at higher risk for the genetically altered subclass of tumors. This improved subclassification also holds tremendous clinical utility as it may help to define patients’ response to specific chemotherapeutic regimens. In this series, we did not observe any relationship between the methylation latent trait and overall patient survival, but we could not examine specific subgroups based on treatment, which could be critical for understanding this relationship.

Acknowledgments

This work was supported by National Institutes of Health Grants ES05974, ES007373, and CA100679 and Flight Attendants Medical Research Institute Young Clinical Scientist Award (C. J. Marsit).