Abstract

High-density mapping of mammalian genomes has enabled a wide range of genetic investigations, including the mapping of polygenic traits, determination of quantitative trait loci, and phylogenetic comparison. Genome sequencing of inbred mouse strains has identified high-density single nucleotide polymorphisms (SNPs) for the investigation of complex traits, and these have become a useful tool for biomedical research on human disease because they alleviate the ethical and practical problems of experimentation in humans. Nuclear factor (erythroid-derived 2)-like 2 (NRF2) encodes a key host defense transcription factor. This review describes the genetic characteristics of human NRF2 and its homologs in other vertebrate species. NRF2 is evolutionarily conserved and shares sequence homology among species. Compilation of publicly available SNPs and other genetic mutations shows that human NRF2 is highly polymorphic, with a mutation frequency of about 1 per 72 bp. Functional at-risk alleles and haplotypes have been demonstrated in various human disorders. In addition, other pathogenic alterations in NRF2, including somatic mutations and misregulated epigenetic processes, have led to oncogenic cell survival. The information compiled in this review addresses the association of NRF2 variation with disease phenotypes and supports new insights into therapeutic strategies.

1. Overview

The gene nuclear factor (erythroid-derived 2)-like 2 (NFE2L2), more commonly known by its synonym nuclear factor erythroid 2- (NF-E2-) related factor 2 (NRF2), and its mouse homolog (Nfe2l2, Nrf2) encode a ubiquitous transcription factor belonging to the basic leucine zipper (bZIP) protein family [1, 2]. NRF2 modulates downstream genes by binding to their cis-regulatory modules, the antioxidant response elements (AREs). NRF2 targets include ARE-bearing effector genes such as reactive oxygen species (ROS) scavenging enzymes (e.g., superoxide dismutases, SODs), phase-2 defense enzymes (e.g., glutathione-S-transferase, GST; heme oxygenase-1, HO-1), drug efflux pumps (e.g., multidrug resistance proteins, MRPs), and various interacting and indirectly modulated proteins [3–6]. The NRF2-ARE pathway has emerged as a mechanism in human diseases in which oxidative stress is implicated. Importantly, three lines of gene-targeted (knockout) mice were generated by Drs. M. Yamamoto (Nfe2l2tm1Mym), Y. W. Kan (Nfe2l2tm1Ywk), and P. A. Ney (Nfe2l2tm1Ney) [7–9], and 124 gene-trapped or gene-targeted cell lines have been established (http://www.informatics.jax.org/searches/allele_report.cgi?markerID=MGI:108420). Over the last decade or more, wide application of the knockout mice to human disease models has led to new insights into disease pathogenesis and therapeutic potential (Figure 1).

Kelch-like ECH-associated protein 1 (KEAP1 for humans, Keap1 for mice, or iNrf2 for rats) is a cytoplasmic suppressor of NRF2 and is critical for NRF2 homeostasis and activity [10]. Substantial efforts have led to the discovery of the molecular mechanisms of KEAP1-mediated NRF2 regulation. In unstressed conditions, the NRF2-bound KEAP1 homodimer is complexed with a ubiquitin ligase (Cullin 3-based E3 ligase), which polyubiquitinates NRF2 for proteasomal degradation and maintains NRF2 homeostasis (20 min half-life of cellular NRF2 [11]). Under stressed conditions, however, modifications of KEAP1 (e.g., at cysteine residues) and NRF2 (e.g., at serine residues) activate NRF2 by liberating it from the “hinge and latch” NRF2-KEAP1 affinity binding, allowing its nuclear translocation [12].

In the current review, I address genetic aspects of human NRF2 and its homologs in other vertebrate species. Sequence variations in human NRF2 and murine Nrf2 including single-nucleotide polymorphisms (SNPs) were collected from public databases and compiled. Mutations that have been associated with disease risks are defined. Nongenetic variations including somatic mutations and epigenetic modifications are also described. Although the current review does not deal with mutations in other species, recent characterization of nrf2 mutant zebrafish which were hypersensitive to environmental toxicants [13] also provides a useful investigational tool.

Homology scores of the gene (coding DNA sequences, cds) and protein across 10 species were compared with human NRF2. The highest sequence homology (98%-99%) was with chimpanzee and rhesus monkey, while the lowest similarity was found with zebrafish (Table 1). Although there is approximately 83% homology between the cds and protein sequences of humans and rodents, the 5′-untranslated regions (5′-UTR, UTR-5) of these species differ in length (114 bp in human, 233 bp in mouse, and 82 bp in rat), and the human 5′-UTR does not share significant sequence homology with that of either rat or mouse (the rat 5′-UTR is 94% homologous with the 3′ portion of the mouse 5′-UTR).
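The homology scores discussed above are, in essence, percent identity values over aligned sequences. The following is a minimal sketch of that calculation, assuming a pairwise alignment is already available; the function name `percent_identity` and the two toy sequences are hypothetical and are not NRF2 sequence, and the tool actually used for Table 1 is not specified in this review.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment for illustration only (not real NRF2 sequence).
toy_a = "ATGGA-CTTGGAG"
toy_b = "ATGGATCTAGGAG"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over ungapped columns")
```

Excluding gapped columns is one convention among several; counting gaps as mismatches would give lower scores, so reported homology values are only comparable when the same convention is used.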

Human NRF2 is located in cytogenetic band 2q31.2 of chromosome 2, spanning 178,095,031–178,129,859 bp as a complementary sequence (gene ID: 4780, Table 1). Murine Nrf2 maps as a complementary sequence to chromosome 2 C3 (44.75 centimorgan) and spans 75,675,519–75,704,641 (gene ID: 18024, Table 1). The complete cds of NRF2 is 2,859 bp, and 14 transcript variants have been reported (http://useast.ensembl.org/Homo_sapiens). Mouse Nrf2 mRNA spans 2,469 bp, and another variant has been reported (http://useast.ensembl.org/Mus_musculus). The human NRF2 protein (ID: NP_006155) contains 605 amino acid (aa) residues with a molecular weight of 67.7 kDa (isoform 1), and a total of 12 isoforms have been published (the National Center for Biotechnology Information, NCBI, http://www.ncbi.nlm.nih.gov/refseq/rsg/; Ensembl, http://useast.ensembl.org/index.html; UniProt Consortium, http://www.uniprot.org/uniprot/Q16236, http://www.uniprot.org/uniprot/Q60795). The mouse Nrf2 protein (ID: NP_035032) comprises 597 aa at 66.8 kDa (Table 1). Structurally, 6 NRF2-ECH homology (Neh) domains make up the protein sequence of either species (Table 2), and the potential functions of each region, particularly the highly conserved KEAP1-binding Neh2 and DNA- (ARE-) binding Neh1 domains, have been intensively investigated [12, 14, 15].

3.1. Evolution, Genome Sequence, and Polymorphism Discovery in Human and Mouse

While rare, monogenic Mendelian diseases are caused by heritable mutations in a single gene [16], many common diseases are complex traits, and their phenotypes are affected by variants in multiple genetic loci. Recent advances in high-throughput technology have enabled sequencing of entire mammalian genomes [17–19], and information on DNA sequence and variation has facilitated the study of complex traits in human disorders. Genome-wide association studies (GWAS) examine whether SNPs are associated with important disease traits and ascertain “at-risk” genotypes that are significantly more prevalent in the affected group than in the nonaffected group. The HapMap Project (http://hapmap.ncbi.nlm.nih.gov/) has mapped combinations of alleles at specific loci (haplotypes), that is, common patterns of sequence variation, in several human populations. It has supported efficient mapping of multiple loci for complex traits in GWAS. Candidate gene approaches based on findings from GWAS of similar disorders have also been useful for determining the potential genetic mechanisms of diseases.
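The association studies summarized in this review report their findings as odds ratios (OR) with 95% confidence intervals (CI). As a minimal sketch of how such a value can be obtained from a 2×2 table of carriers and noncarriers, the example below uses the standard log-odds (Woolf) approximation. The function name and the counts are hypothetical and chosen only for illustration; the cited studies may have used adjusted or model-based estimates instead.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-odds) confidence interval.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    cells = (case_carriers, case_noncarriers,
             control_carriers, control_noncarriers)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of ln(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 of 100 cases and 15 of 100 controls carry the allele.
odds_ratio, lower, upper = odds_ratio_ci(30, 70, 15, 85)
print(f"OR {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 is conventionally read as a significant association, which is how the OR and CI values quoted for NRF2 variants in this review can be interpreted.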

The evolutionary divergence of the human and mouse lineages occurred roughly 75 million years ago, and their genome sequences have since been altered by nearly one substitution (or deletion or addition) for every two nucleotides [20]. However, this slow evolutionary process resulted in a high degree of conservation across the two species, which allows alignment of orthologous sequences: >90% of the human and mouse genomes can be partitioned into corresponding regions of conserved synteny, and at the nucleotide level approximately 40% of the human genome aligns to the mouse genome [20]. For this reason, biomedical studies of human genes are complemented by experimental manipulation of the corresponding mouse genes, which has aided the functional understanding of genes in human health. Following the 2003 completion of the Human Genome Project, covering approximately 3.1 giga base pairs (Gbp), the Mouse Genome Project assembled the complete genome sequence of one strain (C57BL/6J; 2,716,965,481 bp) in 2011. Using this reference strain, whole genome sequencing data were generated for 16 additional inbred strains (http://www.sanger.ac.uk/resources/mouse/genomes/, [21]). Discovery of high-density SNPs in the mouse genome supports reconstruction of the evolutionary history of the strains and provides a tool for investigating models of human disease processes that often cannot be practically studied directly in humans.

3.2. Genetic Mutations in Human NRF2

Human NRF2 encodes three major protein isoforms (Figure 2). Transcript variant 2 (NM_001145412.2, 2,746 bp) has an alternate promoter, 5′-UTR, and a downstream start codon compared to variant 1 (NM_006164.4, 2,859 bp). It encodes isoform 2, which lacks the N-terminal 16 aa (NP_001138884 or Q16236-2, 589 aa) relative to isoform 1 (NP_006155.2 or Q16236, 605 aa). Isoform 3 (NP_001138885.1 or Q16236-3, 582 aa) is encoded by transcript variant 3 (NM_001145413.2; 2,725 bp) and lacks an internal segment relative to isoform 2 due to an alternate in-frame splice site in the 3′ coding region. In public databases, more than 583 sequence mutations are reported in NRF2 (34,827 bp) and the 7,000 bp upstream (Table S1; data acquired as of December 2012; see Table S1 in the Supplementary Material available online at http://dx.doi.org/10.1155/2013/286524). NRF2 is located at 178,130,354-178,129,304 bp of the GRCh37.p10 Primary Assembly, and Figure 2 shows the sequences of the proximal promoter (−1 to −500), partial mRNA variant 1 including the 5′-UTR (exon 1, up to TSS), and protein isoform 1 (NP_006155, encoded by variant 1 NM_006164.4: 556–2,373 bp). Based on the current assembly and sequence update, the previously published promoter positions −686 [22]/−653 [23] are identified as −214; −684 [22]/−651 [23] as −212; and −650 [22]/−617 [23] as −178. The overall frequency of NRF2 SNPs and other mutations is about 1 per 72 bp. The genetic mutations include 37 in the 5′ flanking promoter and 59 in exons (Tables 3 and S1). Among the exon SNPs, 26 are nonsynonymous (Cns) mutations. A triplet repeat variation in the 5′-UTR (rs143406266; GCC4 versus GCC5, previously published under a different notation) was uniquely identified in Asian populations [22, 24].
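The figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the density of about 1 variant per 72 bp was derived by dividing the gene span plus the 7,000 bp upstream region by the 583 compiled variants; that denominator is an assumption, although it reproduces the stated value. It also checks that the coding interval 556–2,373 of NM_006164.4 corresponds to the 605 aa of isoform 1, assuming the interval includes the stop codon. All variable names are illustrative.

```python
# Values quoted in the text for human NRF2.
GENE_SPAN_BP = 34_827      # NRF2 gene span
UPSTREAM_BP = 7_000        # upstream region included in the compilation
REPORTED_VARIANTS = 583    # "more than 583", so this is a lower bound

# Assumed denominator: gene span plus upstream region.
bp_per_variant = (GENE_SPAN_BP + UPSTREAM_BP) / REPORTED_VARIANTS
print(f"about 1 variant per {bp_per_variant:.0f} bp")

# Coding interval of transcript variant 1 (1-based, inclusive).
CDS_START, CDS_END = 556, 2_373
cds_length = CDS_END - CDS_START + 1
codons, remainder = divmod(cds_length, 3)
residues = codons - 1  # assumption: the final codon is the stop codon
print(f"CDS length {cds_length} bp, {codons} codons, {residues} residues")
```

Because the variant count is a lower bound, the true density is at least as high as this estimate.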

3.3. SNPs, Haplotypes, and Association with Disease Risk

The use of gene knockout mice in model systems has provided potential insights into the role of NRF2 in the pathogenesis of various human disorders (see Figure 1). Recent epidemiological and association studies have revealed significant associations of NRF2 sequence variations with disease risks, which further supports NRF2 as a susceptibility gene. Most of the phenotype-associated variants are in the promoter region and presumed to be involved in NRF2 gene regulation. Table 4 summarizes NRF2 SNP and/or haplotype alleles that have been associated with oxidant-related disease risks. Interestingly, there is no evidence for exon SNPs as at-risk alleles. For convenience and consistency, intronic and 3′ distal SNP alleles are presented as chromosome contig (HGVS) alleles while promoter and exon SNPs are presented as reversed contig alleles throughout the text.

Pulmonary Diseases. NRF2 SNPs in the promoter and intron 1 sequences have been investigated for their potential associations with the risk of critical pulmonary disorders including acute lung injury (ALI), cigarette smoke-induced chronic obstructive pulmonary disease (COPD), and asthma. A heterozygous C/A SNP at the −178 position (rs6721961T>C or T>G, previously −617 or −650) significantly increased the risk of developing ALI following major trauma in European and African-American populations (odds ratio, OR 6.44; 95% confidence interval, CI 1.34–30.8; allelic frequency = 11.9% at 21/180) [23]. Promoter activity of the A allele (A/C or A/A), determined in vivo and in vitro, was significantly lower than that of the C/C genotype at that locus (−178 in an ARE-like motif), indicating that it is a functional SNP for autoregulation [23]. The −178G/G genotype was also nominally associated with ALI-related 28-day mortality following systemic inflammatory response syndrome [37]. In a Japanese cohort, an SNP haplotype (rs2001350T/rs6726395A/rs1962142A/rs2364722A/rs6721961T) containing the −178A/A homozygote was associated with a rapid annual decline in forced expiratory volume in one second (FEV1) in relation to cigarette-smoking status [34]. In addition, a promoter and 5′-UTR SNP haplotype consisting of the −214 G allele (52%, rs35652124, previously −686/−653), −212G/G (98%, rs67006649, previously −684/−651), the −178C allele (73%), and GCC4 (53%) was predicted to increase respiratory failure development (hazard ratio = 0.95, CI 0.91–0.99) in German COPD patients [32]. A significant interaction was also identified between an intronic SNP G allele (rs6726395, g.178103229A>G, 88.4% frequency) and smoking status on FEV1 decline, relative to the reference A/A genotype, in the above Japanese cohort [34]. Siedlinski et al. [39] reported that the C/C genotype of another intronic SNP (rs2364723, g.178126546G>C) was associated with a lower FEV1 level compared to the wild-type genotype (G/G) in two Dutch cohorts (CI, −63.6−17.8, frequency = 0.525, and pooled cohort size = 2,542). This SNP, alone or as a haplotype with 4 more intronic SNPs (rs13001694G/rs1806649T/rs4243387T/rs6726395G), was also associated with high FEV1 levels in individuals who had ever smoked [39]. In a Hungarian population with childhood asthma, SNPs at the −178 (C/A) and 3′ flanking (rs2588882T/G) loci were inversely associated with infection-induced asthma (OR 0.437; CI 0.28–0.80, OR 0.290; CI 0.13–0.62, resp.), and these SNPs significantly influenced an asthma-environmental pollution interaction [38]. The intronic SNP rs1806649 (C>T) was associated, but not significantly, with an increased risk of hospitalization during periods of high-level particulate matter (PM10) in asthma or COPD patients in the United Kingdom (UK) [41]. Asthma and COPD admission rates were related to the increase in environmental PM10 concentration. Importantly, the effects of interaction between prenatal stress and NRF2 SNPs on offspring pulmonary health were investigated by the Avon Longitudinal Study in the UK: maternal smoking during pregnancy was not associated with changes in lung function, as determined by maximal mid-expiratory flow, or with asthma incidence in school-aged children, and this relation was not modified by NRF2 SNP genotypes [42]. However, acetaminophen exposure in early gestation significantly influenced the risk of asthma and wheezing at the age of 7 years in >4,000 mothers and >5,000 children [33]. When maternal copies of the −212A allele were present, the association with asthma (1,137/4,891; OR 1.73, CI 1.22–2.45) and wheezing (1,149/4,949; OR 1.53, 95% CI 1.06–2.20) was significantly increased [33].

Gastrointestinal Disorders. While there was no evidence in lung cancer cases, studies in Japanese populations suggested a potential association of NRF2 variations with gastrointestinal tumorigenesis. Helicobacter pylori (H. pylori) causes gastritis, which can lead to gastric atrophy and cancer. In gastric epithelium from Japanese cancer cohorts (39 gastric cancers, 46 controls), H. pylori infection was positively correlated with aberrant CpG island methylation of tumor suppressor genes (e.g., p14), and the −214G/−212G or −214A/−212G NRF2 haplotype was significantly associated with increased (OR 2.90; 95% CI 1.14–7.36) or decreased (OR 0.33; 95% CI 0.13–0.88) risk of CpG methylation, respectively, in the H. pylori-infected patients [29]. A further study by the same investigators determined that −214A/−212G allele carriers had a significantly reduced risk of gastric cancer in H. pylori-negative cases [30]. The −214A/G−212A/G genotypes were negatively associated (OR 0.45, CI 0.22–0.93), and the −214G−212G genotypes were positively associated (chronic continuous phenotype; OR 2.57, CI 1.01–6.60), with ulcerative colitis (89 patients, 141 controls) in a Japanese population [31].

Autoimmune Disorders. Systemic lupus erythematosus (SLE) is a long-term autoimmune disease found more frequently in females than in males. It affects organs including the skin, joints, kidneys, and brain, and nephritis is an aggressive characteristic in some patients. Genome-wide association studies in humans identified a suggestive quantitative trait locus near NRF2 [43]. A study of a Mexican Mestizo population (362 patients with childhood-onset SLE, 379 controls, and 212 diagnosed with nephritis) determined that lupus with nephritis was significantly (OR 1.81, CI 1.04–3.12) associated with the −214G/A SNP in females [25]. The same SNPs were not closely associated with SLE risk in a Japanese cohort [22]. Vitiligo is a skin condition in which brown color (pigment) is lost from areas of skin, resulting in irregular white patches. It is thought to be an autoimmune disease caused by the loss of the cells (melanocytes) that produce brown pigment. A study indicated that the −178A allele increased the risk of vitiligo dose-dependently (OR 1.724, 95% CI 1.35–2.21 for C/A; OR 2.902, CI 1.62–5.19 for A/A) [35].

Female Disorders. It is well known that estrogen metabolites (e.g., catechols) cause ROS formation, suggesting a role for NRF2 and its downstream effectors in postmenopausal mammary cancer. In a study of a Finnish population (Kuopio Breast Cancer Project, patients, 370 controls), the −178A/A homozygous genotype (OR 4.656; CI = 1.35–16.06) and the 3′ flanking rs2706110 (T/T; OR 2.079, CI 1.18–3.68) genotype were associated with increased risk of breast cancer, while the 5′ flanking −3,306T/T homozygous genotype was significantly associated with lower survival (frequency = 71/219, OR 1.687, CI 1.105–2.75) [27], suggesting that NRF2 genetic polymorphisms affect the susceptibility and outcome of the patients. Carriers of the −178A allele together with the intronic rs1962142A allele had lowered tissue levels of NRF2 protein [27]. In postmenopausal women, the −178A allele (OR 17.9; 95% CI 3.70–85.70) appeared to modify the risk of venous thromboembolism caused by oral estrogen therapy (A/A or A/C frequency = 33.3%), as demonstrated by the French ESTHER study (161 cases, 474 controls) [36]. An intronic rs1806649C>T SNP was not associated with breast cancer risk in postmenopausal women [40]. However, when this SNP and other at-risk alleles of ARE-responsive genes (NQO1, HO-1, NOS3) were combined, there was a significant gene-dose effect on breast cancer risk [40]. Although coding region SNPs in NRF2 and KEAP1 were identified in Japanese endometrial adenocarcinoma patients, no association of NRF2 SNPs with the disease was found [44].

Neurodegenerative Diseases. Oxidative stress is known to be involved in Parkinson’s disease (PD), presumably due to the production of ROS from high dopamine metabolism and low levels of antioxidants in the substantia nigra of the brain. Investigators found a protective NRF2 haplotype consisting of four 5′ flanking SNPs (−5238G/−214A/−212G/−178C) and 4 intronic SNPs (rs2886161A/rs1806649A/rs2001350A/rs10183914A) in Swedish (OR 0.9, CI 0.60–1.40) and Polish (OR 0.4, CI 0.30–0.60) populations (total PD cases, controls) [26]. The investigators also suggested that NRF2 haplotype alleles were associated with an age of Alzheimer’s disease (AD) onset 2 years earlier, an age of posterior subcapsular cataract surgery 4 years earlier, and an age of cortical cataract surgery 4 years later, while they were not significantly related to AD or age-related cataract risk [45].

3.4. Genetic Mutations in Mouse Nrf2

Tsang et al. [46] compiled 673 SNPs in 55 mouse strains and constructed their phylogenetic tree to correlate and clarify the origins of the strains based on the assembled mouse genome sequence and SNP data [20, 47, 48]. More recently, high-density SNP screens of other laboratory strains or panels of strains have been published using the complete genome sequence of the C57BL/6J (B6) mouse as a reference (see [17]). Although millions of mouse SNPs (>10,089,892 as of December 2012) and haplotype maps from more than 120 strains have been published as valuable references for dissecting the genetic basis of complex traits [49–51], little attention has been paid to polymorphisms of Nrf2 and their correlation with disease phenotypes.

Figure 3 demonstrates the proximal promoter region (−1 to −950)/5′-UTR (exon 1, up to TSS) and protein sequence of mouse Nrf2 based on GRCm38.p1 Primary Assembly (75,704,641–75,675,513 bp), mRNA variant 1, and protein (NP_035032, encoded by NM_010902: 234–2,027 bp) sequences. Genetic variations in the Nrf2 genome of inbred strains collected from public databases are listed in Tables S2 and 5. (See Supplementary Table S2 in Supplementary Material available online at http://dx.doi.org/10.1155/2013/286524.) Overall, 968 genetic mutations are compiled for Nrf2 gene and 5 kb upstream/2 kb downstream regions: 785 SNPs between B6 and another 16 strains were acquired from the Mouse Phenome Database (MPD, http://phenome.jax.org/db/q?rtn=snp/ret1), and additional SNPs and other mutations were acquired from NCBI dbSNP (http://www.ncbi.nlm.nih.gov/snp/?term=mus+musculus%20nfe2l2). In total, 132 mutations are in the promoter (37 in proximal 1 kb), 49 in exons (38 in coding region, 19 Cns), 727 in introns, and 60 in the 3′ flanking region. Excluding mutations in the 5′ and 3′ flanking sequences, murine Nrf2 sequences appear to be more highly variable (1 variation per 37.5 bp) than much of the mouse genome which has an approximate frequency of one SNP per every 245 bp (http://www.informatics.jax.org/mgihome/homepages/stats/all_stats.shtml#allstats_snp).
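The tallies in the preceding paragraph are internally consistent, as the following sketch shows. It assumes that the density of 1 variation per 37.5 bp was obtained by dividing the gene span on GRCm38.p1 by the exon plus intron variant counts, and that the coding interval 234–2,027 of NM_010902 includes the stop codon. Both are assumptions that happen to reproduce the stated figures, and all variable names are illustrative.

```python
# Variant counts by region, as quoted in the text for mouse Nrf2.
region_counts = {
    "promoter": 132,
    "exons": 49,
    "introns": 727,
    "3' flanking": 60,
}
total_variants = sum(region_counts.values())
print(f"total compiled variants: {total_variants}")

# Gene coordinates on GRCm38.p1 (inclusive), as quoted in the text.
GENE_START, GENE_END = 75_675_513, 75_704_641
gene_span_bp = GENE_END - GENE_START + 1

# Flanking regions are excluded, so only exon and intron variants are counted.
genic_variants = region_counts["exons"] + region_counts["introns"]
bp_per_variant = gene_span_bp / genic_variants
print(f"about 1 variant per {bp_per_variant:.1f} bp within the gene")

# Comparison with the quoted genome-wide figure of one SNP per 245 bp.
GENOME_BP_PER_SNP = 245
fold_denser = GENOME_BP_PER_SNP / bp_per_variant
print(f"roughly {fold_denser:.1f}-fold denser than the genome-wide figure")

# Coding interval of NM_010902 (1-based, inclusive).
CDS_START, CDS_END = 234, 2_027
codons, remainder = divmod(CDS_END - CDS_START + 1, 3)
residues = codons - 1  # assumption: the final codon is the stop codon
print(f"{residues} residues encoded")
```

The four regional counts sum to the 968 mutations reported, and the residue count matches the 597 aa given for NP_035032.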

Nrf2 was identified as a susceptibility gene by genome-wide linkage analysis in a murine model of hyperoxia-induced ALI [52]. A promoter SNP −103T>C (previously published as −336T>C) in Nrf2 was found and predicted to add an Sp1 binding site in hyperoxia-susceptible B6 mice, but not in resistant C3H/HeJ mice [52]. Genotypes of the SNP and of simple-sequence length polymorphism markers of the Nrf2 locus (D2Mit248 and D2Mit94) cosegregated in the B6C3F2 mouse cohort [52], and Nrf2-deficient mice were significantly more susceptible to ALI subphenotypes caused by hyperoxia than similarly exposed wild-type mice, supporting Nrf2 as a contributor to the phenotypic traits [53]. Although no other functional analyses of Nrf2 SNPs or haplotype association studies have been conducted in inbred mice, strains bearing haplotypes such as multiple Cns in functional domains (e.g., F71L, L451V, H543Q, and L575M) may be useful for elucidating the role of Nrf2 in differential susceptibility to oxidative diseases.

A somatic mutation is a change in the DNA of somatic cells that affects derived cells but is not inherited by offspring. Efforts to discover somatic mutations have provided insight into mutagenesis and cancer development. Lung cancer, particularly non-small cell lung cancer (NSCLC), is the leading cause of cancer death worldwide. Somatic mutations of NRF2 and KEAP1 discovered in lung cancer patients have revealed the oncogenic potential of NRF2 [54, 55]. KEAP1 somatic mutations were associated with reduced KEAP1 protein levels in lung cancer tissues and cells [56, 57]. Investigations of NSCLC in various ethnic populations, as well as of cancers of the gastrointestinal tract, breast, and prostate, have consistently demonstrated that multiple Cns somatic mutations in KEAP1 cause dysfunction of the translated protein and, in turn, constitutive activation of NRF2, increasing the risk of neoplasia and chemoresistance [12, 55, 58, 59]. Somatic mutations of NRF2 have been detected in various cancer tissues (largely squamous cell carcinomas) in Asian populations (Table 6). NRF2 mutations were significantly associated with NSCLC cases (squamous cell lung carcinoma, adenocarcinoma) in Japanese (10.7%, [54]), Chinese (23%, [60]), and Korean (8%, [61]) patients as well as with lung cancer cell lines. Smoking history was also correlated with mutation occurrence in all of the studies [54, 60, 61]. In addition to lung cancers, laryngeal squamous carcinoma (13% in [61]), esophageal squamous cancer (ESC, 22% in [60], 11.4% in [61]), head and neck cancers (25% in [54]), skin cancer (1 of 17 cases in [61]), and oral cancer cell lines had somatic changes in NRF2. In contrast to the widespread KEAP1 mutations, mutations in NRF2 were clustered in the DLG/ETGE motifs of the Neh2 domain, which are critical in the “hinge and latch” model of KEAP1 binding [12]. As with KEAP1 somatic mutations, it has been postulated that NRF2 mutations in cancer cells lead to NRF2 accumulation by suppressing its ubiquitination or KEAP1 binding, which eventually confers malignant potential and resistance to chemotherapy.

The most variable sites in NRF2 included aa residues 29 (Asp, D), 31 (Gly, G), 77 (Asp, D), and 79 (Glu, E) (Table 6). Residue 33 (Ser, S) in the Neh2 domain is mutated by either genetic or somatic processes (Figure 4). Cns in the EDGF motif of NRF2 was experimentally determined to impair recognition by KEAP1 [54]. NRF2 mutations were significantly correlated with increased (2.5-fold) copy number (31% of mutants versus 3% of wild types) in Japanese NSCLC cases [63]. Aberrant mutation of NRF2 also led to increased expression of downstream effectors including RagD, which is known to be involved in squamous lung cancer cell proliferation [64], suggesting that the mutation is functional and overcomes KEAP1 inhibition. Singh et al. [65] determined in vitro that RNAi-mediated depletion of NRF2 in lung cancer cells enhanced ROS production and susceptibility to cell death by ionizing radiation. These studies support the concept that elevated NRF2 and ARE responsiveness provide cancer cells with a proliferative advantage for malignant transformation and undue protection from anticancer therapy. Oncogenic epidermal growth factor receptor (EGFR) signaling was recently found to be critical in NRF2-mediated proliferation of NSCLC cells [66].

Collectively, “gain of function” mutations in NRF2 that reduce KEAP1 recognition are suggested to be predictive markers of poor responsiveness to chemotherapy and radiation therapy. Although NRF2-mediated cellular defense processes are essential in the initiation stage, enhanced NRF2-ARE activity in advanced stages of cancer development may create a favorable intracellular environment for tumor cell growth and survival [67, 68]. In this context, NRF2 may be a potential molecular target for the treatment of radioresistant cancers, especially those that have “loss of function” mutations in EGFR, KRAS, or KEAP1 as well as “gain of function” mutations in NRF2.

Epigenetic modifications are alterations of molecules interacting with genes without changes to the primary DNA sequence. They include post-translational modification of histones, DNA methylation events, chromatin conformational changes, and alterations to noncoding regulatory RNAs. Epigenetic alterations are stable and often heritable, but they are reversible and may affect expression of the gene. Dysregulation of or defects in epigenetic processes, particularly hypermethylation of tumor suppressor gene promoters (e.g., CpG islands) or histone modifications, are thought to be associated with carcinogenesis. Investigators have reported hypermethylation in CpG islands of KEAP1, which was associated with reduced KEAP1 expression in human cancers of the lung, prostate, colon, and other tissues [69–72]. As with somatic mutations, epigenetic changes in KEAP1 impaired the function of its encoded protein, leading to constitutive NRF2 activation.

Supporting the role of “pathogenic mutations” in NRF2, expression of Nrf2 and downstream Nqo1 was suppressed in prostate tumors of mice (transgenic adenocarcinoma of mouse prostate, TRAMP). Among 15 promoter CpG islands located between −942 and −654 (c.−1175_c.−1132 and c.−1059_c.−887, gap in c.−1131_c.−1060; see Figure 3), hypermethylation of the first 5 CpG islands (−942_−899 and c.−1175_c.−1132) was significantly associated with tumorigenesis [73]. Moreover, treatment with inhibitors for DNA methyltransferase and histone deacetylase restored Nrf2 expression in these tumor cells [73]. A dietary phytochemical curcumin known as a DNA hypomethylation agent restored epigenetically silent Nrf2 expression through CpG demethylation in carcinogen-induced mouse tumor cells [74].

Whole genome epigenetic datasets for 5 species are publicly accessible at NCBI Epigenomics [75, 76]. The human NRF2 epigenome of primary cells (breast, penis) and the H1 stem cell line, as well as mouse Nrf2 CpG island methylation data for sperm, blood, and cerebellum, are currently available (http://www.ncbi.nlm.nih.gov/epigenomics). Although no direct evidence of disease-associated epigenetic modulation has been identified in human NRF2, various phytochemical NRF2 agonists such as sulforaphane and curcumin have been shown to play roles in DNA methylation and histone modification (see reviews by Lee and colleagues, e.g., [77]). Taken together, epigenetic modifications of the NRF2/KEAP1 axis are predicted to cause dysregulation of ARE-mediated cellular defense, leading to deleterious health effects, and phytochemical antioxidants acting as epigenetic modulators of NRF2 are suggested to be useful in cancer prevention.

6. Conclusions

NRF2 is evolutionarily conserved, with high sequence homology in many species. However, it is a highly mutable gene, and numerous genetic variants have been discovered in human ethnic groups. Importantly, certain SNPs or haplotypes have been identified in various diseases as “at-risk” alleles and are related to functional alterations. In addition to genetic variations, multiple somatic mutations identified in the KEAP1 recognition domain of NRF2 in cancer cells have been found to be oncogenic due to dysregulation of NRF2 homeostasis through excess “gain of function”. Epigenetic alteration of NRF2 is under investigation and is predicted to have pathogenic influences, as learned from mouse and phytochemical agonist studies. Continuous updates of Nrf2 allelic variants in inbred mouse strains will provide a useful tool for designing effective experiments in models of oxidative disorders, giving insight into disease mechanisms and intervention strategies.

Abbreviations

AD: Alzheimer’s disease
ALI: acute lung injury
ARE: antioxidant response element
bZIP: basic leucine zipper
cds: coding DNA sequences
CI: confidence interval
COPD: chronic obstructive pulmonary disease
ESC: esophageal squamous cancer
FEV1: forced expiratory volume in one second
Gbp: giga base pairs
GST: glutathione S-transferase
GWAS: genome-wide association study
HGVS: Human Genome Variation Society
HO-1: heme oxygenase-1
H. pylori: Helicobacter pylori
KEAP1: Kelch-like ECH-associated protein 1
MPD: Mouse Phenome Database
MRP: multidrug resistance protein
NCBI: National Center for Biotechnology Information
Neh: NRF2-ECH homology
Nfe2l2: nuclear factor (erythroid-derived 2)-like 2
Nrf2: NF-E2-related factor 2
NSCLC: non-small cell lung cancer
NQO1: NAD(P)H:quinone oxidoreductase 1
OR: odds ratio
PD: Parkinson’s disease
PM: particulate matter
ROS: reactive oxygen species
SLE: systemic lupus erythematosus
SNP: single nucleotide polymorphism
SOD: superoxide dismutase
UTR: untranslated region.

Disclosure

Author’s contribution to the Work was done as part of the Author’s official duties as a NIH employee and is a Work of the United States Government. Therefore, copyright may not be established in the United States.

Conflict of Interests

The author declares that there is no conflict of interests.

Acknowledgments

The research related to this paper was supported by the Intramural Research Program of the National Institute of Environmental Health Sciences (NIEHS), National Institutes of Health. Drs. Steven Kleeberger and Stephanie London at the NIEHS provided excellent critical review of this paper. The author thanks Mrs. Jacqui Marzec for her helpful comments and English editing.

Supplementary Materials

Supplementary Table S1 lists genetic mutations in the human NRF2 locus (7 kb upstream included) compiled from public databases. Supplementary Table S2 lists genetic mutations in the murine Nrf2 locus (5 kb upstream and 2 kb downstream included) collected from public databases for 17 inbred strains.
