Biomedical Informatics and Computational Biology for High-Throughput Data Analysis
NCK2 Is Significantly Associated with Opiates Addiction in African-Origin Men
Substance dependence is a complex environmental and genetic disorder with significant social and medical concerns. Understanding the etiology of substance dependence is imperative to the development of effective treatment and prevention strategies. To this end, substantial effort has been made to identify genes underlying substance dependence, and in recent years, genome-wide association studies (GWASs) have led to discoveries of numerous genetic variants for complex diseases including substance dependence. Most of the GWAS discoveries were only based on single nucleotide polymorphisms (SNPs) and a single dichotomized outcome. By employing both SNP- and gene-based methods of analysis, we identified a strong (odds ratio = 13.87) and significant (P value = ) association of an SNP in the NCK2 gene on chromosome 2 with opiates addiction in African-origin men. Codependence analysis also identified a genome-wide significant association between NCK2 and comorbidity of substance dependence (P value = ) in African-origin men. Furthermore, we observed that the association between the NCK2 gene (P value = ) and opiates addiction reached the gene-based genome-wide significant level. In summary, our findings provided the first evidence for the involvement of NCK2 in the susceptibility to opiates addiction and further revealed the racial and gender specificities of its impact.
Substance dependence is believed to result from a combination of genetic and environmental factors. Because substance dependence is a chronic brain disease with high relapse rates, it causes serious social, economic, and medical consequences [1–3]. The World Health Organization (WHO) and the United Nations Office on Drugs and Crime (UNODC) reported that opiates dependence is associated with a high risk of HIV infection when opiates are injected using contaminated injection equipment. Paulozzi et al. in 2006 reported that the number of deaths involving prescription opioid analgesics increased from 2,900 in 1999 to at least 7,500 in 2004, an increase of 160% in just 5 years. All available evidence indicated that the increasing number of deaths is significantly correlated with the increasing use of prescription drugs, especially opioid painkillers, among people of working age. While exposure to drugs is the prerequisite for addiction, the most important question is who will become addicted after the exposure. Genes are believed to be a major factor, although it is most likely that multiple genes as well as gene-environment interactions are involved. For this reason, understanding the genetic mechanisms behind vulnerability to drug addiction is critical to improving overall health and quality of life.
Linkage and genome-wide association studies (GWASs) have implicated many regions and genes for dependence on alcohol, tobacco, and opiates. GABRA2, CHRM2, ADH4, PKNOX2, GABRG3, TAS2R16, SNCA, OPRK1, and PDYN have all been associated with alcohol dependence with various degrees of replication [6–21]. Associations of other candidate alcohol dependence genes, such as KIAA0040, ALDH1A1, and MANBA [18, 20, 22–25], remain to be confirmed. Several groups reported CHRNA5, CHRNA3, CHRNB4, and CSMD1 to be associated with nicotine dependence [26–34]. Meanwhile, recent studies also reported a group of genes, such as OPRM1 [35–37], OPRD1, OPRK1 [21, 38, 39], HTR1B, SLC6A4, GABRG2, and BDNF, to be associated or in linkage with opiates addiction.
Complex diseases may involve heterogeneous genetic effects in different ethnic and gender groups [7, 44–47]. Luo et al.  reported that African-origin smokers become dependent at a lower threshold (number of cigarettes per day) than European-origin smokers. Hartel et al.  found that men are more vulnerable to addiction when compared to women. In addition, Chen et al.  revealed that PKNOX2 is associated with drug addiction in European-origin women. These examples underscore the necessity to consider demographic or even other covariates in genetic association studies.
Many of the reported genetic variants have been identified through single SNP association tests. Despite these successes, a single SNP tends to have a small effect, and single SNP-based association tests require a very stringent significance level, which is likely a key contributor to the so-called “missing heritability” problem [48, 49]. To overcome some of these limitations, gene-based analysis [50–52] has emerged to jointly analyze the SNPs within genes. Gene-based methods are less affected by the heterogeneity of a single locus; hence the results may be more robust across populations, which increases the likelihood of replication. We therefore performed both single SNP-based and gene-based association analyses of the data from the Study of Addiction: Genetics and Environment (SAGE), which includes well-characterized phenotypic data on substance dependence, including addiction to nicotine, alcohol, marijuana, cocaine, opiates, and other drugs. In our analysis, we found a genome-wide significant association of the NCK2 gene on chromosome 2 with opiates dependence in African-origin men at both the SNP and gene levels. NCK2 is a member of the NCK family of adaptor proteins, which is associated with tyrosine-phosphorylated growth factor receptors or their cellular substrates. However, to the best of our knowledge, NCK2 has not been reported to be associated with any drug addiction outcomes in humans.
2. Materials and Methods
Phenotypes for multisubstance dependency and genome-wide SNP data from SAGE were downloaded from dbGaP (http://www.ncbi.nlm.nih.gov/gap). SAGE is a large case-control association study that investigates the genetic variants for drug addiction. The samples were collected from three large-scale genome-wide association studies: the Collaborative Study on the Genetics of Alcoholism (COGA), the Family Study of Cocaine Dependence (FSCD), and the Collaborative Genetic Study of Nicotine Dependence (COGEND) [16, 44, 55, 56]. The original data set contains 4,121 subjects with six categories of substance dependence data: addiction to alcohol, cocaine, marijuana, nicotine, opiates, and other drugs. Lifetime dependence on these six substances was diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV). Genotyping was performed on the Illumina Human1M platform. In this study, we followed a quality control/quality assurance process similar to previous analyses [7, 57]. Individuals with call rates <90% and SNPs with minor allele frequency (MAF) <1% were excluded from the analysis. SNPs were retained only if their P value from the Hardy-Weinberg equilibrium test was >0.0001. These steps reduced the level of noise in genotypes and increased the efficiency of analysis. There were 60 duplicate genotype samples and 9 individuals with ethnic backgrounds other than African origin or European origin; all of those individuals were removed from the subject list. In total, 3,627 unrelated samples with 859,185 autosomal SNPs remained for our final analysis. To alleviate confounding by population substructure, we stratified the sample by race and sex, which yielded four subsamples: 1,393 European-origin women, 1,131 European-origin men, 568 African-origin women, and 535 African-origin men.
The distribution of subjects diagnosed with lifetime dependence on each of the six substance categories (nicotine, alcohol, marijuana, cocaine, opiates, and other drugs) is presented in Table 1.
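To make the quality control thresholds concrete, the following is a minimal sketch of the three filters (call rate, MAF, and Hardy-Weinberg equilibrium). It is not the pipeline used in this study. The genotype coding (0/1/2 copies of one allele, `None` for a missing call), the function names, and the use of a 1-degree-of-freedom chi-square test for Hardy-Weinberg proportions are all assumptions made for illustration; exact tests are also widely used for this purpose.

```python
import math

def call_rate(genotypes):
    """Fraction of non-missing calls; None marks a missing genotype (assumed coding)."""
    return sum(g is not None for g in genotypes) / len(genotypes)

def minor_allele_frequency(genotypes):
    """MAF for genotypes coded as 0/1/2 copies of one allele; missing calls are skipped."""
    called = [g for g in genotypes if g is not None]
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)

def hwe_chi2_pvalue(genotypes):
    """P value of a 1-df chi-square test of Hardy-Weinberg proportions (a simplification)."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:
        return 1.0  # monomorphic SNP: nothing to test
    observed = [called.count(k) for k in (0, 1, 2)]
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def passes_individual_qc(genotypes, min_call_rate=0.90):
    """Keep an individual whose call rate is at least 90%."""
    return call_rate(genotypes) >= min_call_rate

def passes_snp_qc(genotypes, maf_min=0.01, hwe_min=1e-4):
    """Keep a SNP with MAF >= 1% and Hardy-Weinberg P value > 0.0001."""
    return (minor_allele_frequency(genotypes) >= maf_min
            and hwe_chi2_pvalue(genotypes) > hwe_min)
```

Here `passes_individual_qc` would be applied to each individual's genotypes across SNPs, and `passes_snp_qc` to each SNP's genotypes across individuals. The sketch assumes at least one non-missing call per input.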
Figure 1 displays the flow chart of our analytic strategy, and the details of the association analysis methods are described later.
3.1. Statistical Analysis for Single Trait
The SNP-based association analysis was performed with the standard allelic test and logistic regression to obtain the P values for individual SNPs, using PLINK software (version 1.07). Meanwhile, a list of SNP pairs in linkage disequilibrium (LD) was calculated for the gene-based association test.
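For readers unfamiliar with the allelic test, it amounts to a 1-degree-of-freedom chi-square test on a 2×2 table of allele counts (minor versus major allele, cases versus controls). The sketch below illustrates that calculation; the function name and the example counts are hypothetical, and it is not a reimplementation of PLINK.

```python
import math

def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Pearson chi-square (1 df, no continuity correction) on a 2x2 allele-count table.

    Returns (chi2, p_value). All four margins are assumed to be nonzero.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical allele counts, for illustration only.
chi2, p_value = allelic_test(30, 70, 10, 90)
```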
For the gene-based analysis, we used the open-source tool Knowledge-Based Mining System for Genome-Wide Genetic Studies (KGG, version 2.0), which works from the SNP association test results and LD files produced by PLINK. The procedure was as follows. We first calculated the effective number of independent P values, $m_e$, among the $m$ SNPs within a gene. Then, we sorted the SNPs by P value and calculated the effective number of independent P values, $m_{e(j)}$, among the top $j$ most significant SNPs. Finally, the modified Simes test was employed to obtain a gene-based P value as follows:

$$P_G = \min_{1 \le j \le m} \left( \frac{m_e \, p_{(j)}}{m_{e(j)}} \right),$$

where $p_{(j)}$ is the $j$th most significant P value among the $m$ SNPs within a gene. We refer interested readers to the original description of the method for details.
In the gene-based method, SNPs within 20 kilobases (kb) upstream (5′) and 10 kb downstream (3′) of a gene’s coding region were assigned to the gene. In addition, we included other SNPs if they were in strong LD with the initially mapped SNPs within the gene.
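The modified Simes procedure described above can be sketched in a few lines. This is only an illustration, not the KGG implementation. In particular, the effective number of independent P values is estimated from the LD structure in practice; here it is passed in as a hypothetical callable `effective_number(j)`, and the default assumes fully independent SNPs, in which case the procedure reduces to the ordinary Simes test.

```python
def modified_simes(p_values, effective_number=None):
    """Gene-based P value: min over j of m_e * p_(j) / m_e(j).

    p_values: SNP-level P values within one gene.
    effective_number: hypothetical callable giving the effective number of
        independent P values among the top j SNPs. The default (j itself)
        assumes independent SNPs, which is an assumption of this sketch.
    """
    if effective_number is None:
        effective_number = lambda j: j
    sorted_p = sorted(p_values)   # p_(1) <= p_(2) <= ... <= p_(m)
    m = len(sorted_p)
    m_e = effective_number(m)     # effective number among all m SNPs
    candidates = (m_e * p / effective_number(j)
                  for j, p in enumerate(sorted_p, start=1))
    return min(1.0, min(candidates))

# Hypothetical P values for three SNPs in one gene.
gene_p = modified_simes([0.04, 0.01, 0.5])
```

When SNPs are in LD, `effective_number(j)` is smaller than `j`, so the gene-based P value is penalized less than it would be under a plain Bonferroni or Simes correction.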
Since there are about 20,000 protein coding genes in human genome, we used as the genome-wide significance threshold for the gene-based association test. In contrast, we used as the genome-wide significance threshold for the SNP-based association test .
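As a rough illustration of how such thresholds scale with the number of tests, the sketch below applies a Bonferroni correction. The family-wise error rate of 0.05 is an assumption made for this example, and the resulting numbers are not necessarily the thresholds adopted in this study, which may have followed a different convention.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction.

    alpha=0.05 is an assumed family-wise error rate, not a value taken
    from the study.
    """
    return alpha / n_tests

# About 20,000 protein-coding genes (gene-based test).
gene_level = bonferroni_threshold(20_000)
# 859,185 autosomal SNPs retained after quality control (SNP-based test).
snp_level = bonferroni_threshold(859_185)
```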
3.2. Codependence Association Analysis
Although logistic regression is commonly used to study a binary outcome, it is not suitable to evaluate comorbidity involving multiple outcomes. We use a nonparametric association test based on Kendall’s tau  to study the comorbidity. The Kendall’s tau-based association test proceeds as follows.
Suppose that, for the $i$th subject in a population-based study with $n$ subjects, we observe a vector of traits, a genotype, and a vector of covariates, and that the observations are independent across subjects. Generalized from Kendall’s tau, a U-statistic is defined over pairs of subjects $i$ and $j$ to measure the association between their traits and their genotypes. Without considering the covariates and conditioning on all phenotypes, the statistic follows an asymptotically normal distribution in the absence of association. To accommodate covariates, a weighted statistic has been developed [64, 65]. We refer to Jiang and Zhang for a detailed description of the method. For the purpose of comparison, we present the results with and without considering age as the covariate. Recall that our analysis is stratified by ethnicity and gender.
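To give a sense of what a Kendall's tau-type statistic for multiple traits looks like, the following is a simplified sketch. It is not the statistic used in this study: the kernel (sign of the trait difference multiplied by the genotype difference), the function names, and the toy data are assumptions, and the sketch omits the covariate weighting and the variance estimate needed for an actual test.

```python
from itertools import combinations

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def kendall_type_u(traits, genotypes):
    """One pairwise concordance score per trait, averaged over all subject pairs.

    traits: list of equal-length tuples, one tuple of trait values per subject.
    genotypes: list of genotype codes (for example 0/1/2), one per subject.
    """
    n = len(genotypes)
    n_traits = len(traits[0])
    totals = [0.0] * n_traits
    for i, j in combinations(range(n), 2):
        genotype_diff = genotypes[i] - genotypes[j]
        for k in range(n_traits):
            totals[k] += sign(traits[i][k] - traits[j][k]) * genotype_diff
    n_pairs = n * (n - 1) / 2
    return [total / n_pairs for total in totals]

# Toy data: two binary dependence traits for four subjects (hypothetical).
u = kendall_type_u([(1, 0), (1, 1), (0, 0), (0, 1)], [1, 1, 0, 0])
```

In this toy example the first trait rises with the genotype code and the second does not, so the first component of `u` is positive and the second is zero.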
4.1. Association Analysis at SNP Level
Table 2 summarizes the top four significant SNPs (with ) in the NCK2 gene on chromosome 2 (2q12) for opiates dependence in African-origin men. We identified a genome-wide significant SNP (rs2377339 with ) for opiates dependence in African-origin men by the allelic test. Logistic regression also yielded strong evidence for the association between SNP rs2377339 () and opiates dependence, although the P value did not reach the genome-wide significance threshold. In addition, Table 2 presents the association results for the other five addictions with the four candidate SNPs. None of the four SNPs appeared significantly associated with the other five substance addictions.
4.2. Association Analysis at Gene Level
The gene-based association results are displayed in the last two rows of Table 2. Specifically, we included 39 SNPs in NCK2. The gene-based P values for NCK2 obtained through the standard allelic test and logistic regression are and , respectively. The gene-based P value from the standard allelic test reached genome-wide significance at the gene level, and the gene-based P value from logistic regression is very close to the gene-based genome-wide significance level. Therefore, both methods provided evidence supporting the association between the NCK2 gene and opiates dependence in African-origin men. Among the other five substances in African-origin men, nicotine dependence had the most significant association with the NCK2 gene .
4.3. Haplotypes Analysis
We also examined the association of haplotypes in the NCK2 region with opiate addiction. Figure 2 displays the linkage disequilibrium (LD) heat map of 14 SNPs in a 28 kb region. Haplotype “AGTTCAGATCTCGT” with probability 0.016 yielded a P value of . The genome-wide significant association between this haplotype and opiate addiction reduces the chance of a false discovery at the peak of a single SNP.
4.4. Contingency Table Analysis
We further examined the relationship between SNP rs2377339 and the opiates dependence in African-origin men. Table 3 depicts the allele frequencies of SNP rs2377339. The proportion of individuals having minor allele G is 21.43% in the case group and 1.63% in the control group. The odds ratio of SNP rs2377339 is 13.87, indicating that those who have the risk allele (G) for rs2377339 are at a significantly increased risk of being diagnosed with opiates dependence.
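For reference, an odds ratio of this kind is computed from a 2×2 contingency table as the cross-product ratio. The sketch below shows the calculation together with a Woolf-type (log-scale) 95% confidence interval. The counts in the example are hypothetical and are not the counts of Table 3; the function name is likewise illustrative.

```python
import math

def odds_ratio(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    All four cells must be positive. z=1.96 gives an approximate 95% interval.
    Returns (odds_ratio, lower, upper).
    """
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts: 10 of 100 cases and 5 of 100 controls carry the allele.
estimate, lower, upper = odds_ratio(10, 90, 5, 95)
```

With small cell counts, as is the case when a stratum contains few affected subjects, the interval is wide, which is one reason replication in larger cohorts matters.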
4.5. Stratification Analysis
Furthermore, in Table 4, we investigated the racial specificity and sex difference in the association between SNP rs2377339 and opiates dependence. This scrutiny required us to include all racial and gender groups. We observed that the MAF and P values vary between races and genders. The association between rs2377339 and opiates dependence becomes less significant in the overall cohort after adjusting for race and gender in logistic regression.
4.6. Codependence Association Analysis
Table 5 presents the association results for NCK2 and comorbidity of substance dependence. The most significant signal in NCK2 was observed for SNP rs2377339 in African-origin men, with in the adjusted association test and in the unadjusted association test. P values of SNPs in NCK2 for the other ethnicity-by-gender groups were far from the genome-wide significance level and, hence, are omitted here.
We found a genome-wide significant association between SNP rs2377339 and opiates dependence in African-origin men. The NCK2 gene, which contains SNP rs2377339, also achieved genome-wide significance for opiates dependence at the gene level. Among the other five substances, nicotine dependence had the most significant association, but it was not significant at the genome-wide level.
NCK2, a member of the NCK family of adaptor proteins, is reported to be associated with tyrosine-phosphorylated growth factor receptors or their cellular substrates. The association between NCK2 and nicotine dependence has been suggested in humans [67, 68]. Our finding, coupled with those human studies, enhances the plausibility of a causal relationship between NCK2 and drug addiction.
Importantly, about one-fifth of the African-origin men with opiates addiction carried the minor allele G of SNP rs2377339, which is more than 10-fold the frequency in the group without opiates dependence. This suggests that the minor allele G of SNP rs2377339 potentially elevates the risk for opiates dependence in African-origin men. We acknowledge that our analysis included only 44 African-origin men with opiates dependence. Therefore, it is important and necessary to validate our finding through independent and larger cohort studies. Specifically, there are two possible strategies to validate our finding. The direct approach is to replicate the association between SNP rs2377339 and opiates dependence in a larger cohort. An indirect approach is to evaluate whether SNP rs2377339 is associated with any substance dependence (opiates, alcohol, marijuana, etc.) as presented in Table 2.
A distinction of our analysis is that it considers multiple substance addictions simultaneously rather than a single substance. This approach, which is a realistic depiction of substance dependence, indicated that a novel susceptibility gene, NCK2, is significantly associated with substance dependence in African-origin men.
This study has several limitations. First, we stratified by ethnicity and sex, which reduced sample sizes and affected the power of our analysis. Nonetheless, the significant associations revealed in African-origin men are consistent with the notion that men may be socially more prone to environmental influences that promote substance use and thus more vulnerable to addiction. Second, for SNP rs2377339, we observed heterogeneous genetic effects, suggesting interactions between race, sex, and the gene, because the association is much weakened after adjusting for race and gender. Such interactions have been suggested in other addiction research [44, 45, 47]. Our result further supports the importance of examining interactions among genes, race, and sex in addiction.
Conflict of Interests
The authors declare that they have no conflict of interests.
Zhifa Liu and Xiaobo Guo contributed equally.
This work was supported by Grant R01 DA016750-09 from the National Institute on Drug Abuse. Funding support for the Study of Addiction: Genetics and Environment (SAGE) was provided through the NIH Genes, Environment and Health Initiative (GEI) (U01 HG004422). SAGE is one of the genome-wide association studies funded as part of the Gene Environment Association Studies (GENEVA) under GEI. Assistance with phenotype harmonization and genotype cleaning, as well as with general study coordination, was provided by the GENEVA Coordinating Center (U01 HG004446). Assistance with data cleaning was provided by the National Center for Biotechnology Information. Support for collection of datasets and samples was provided by the Collaborative Study on the Genetics of Alcoholism (COGA; U10 AA008401), the Collaborative Genetic Study of Nicotine Dependence (COGEND; P01 CA089392), and the Family Study of Cocaine Dependence (FSCD; R01 DA013423). Funding support for genotyping, which was performed at the Johns Hopkins University Center for Inherited Disease Research, was provided by the NIH GEI (U01HG004438), the National Institute on Alcohol Abuse and Alcoholism, the National Institute on Drug Abuse, and the NIH Contract “High Throughput Genotyping for Studying the Genetic Contributions to Human Disease” (HHSN268200782096C). The datasets used for the analyses described in this paper were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000092.v1.p1 through dbGaP Accession no. phs000092.v1.p.
W. R. True, A. C. Heath, J. F. Scherrer et al., “Interrelationship of genetic and environmental influences on conduct disorder and alcohol and marijuana dependence symptoms,” American Journal of Medical Genetics, vol. 88, no. 4, pp. 391–397, 1999.
G. E. Uhl and I. Gregory, “Genetic influences in drug abuse,” in Psychopharmacology: The Fourth Generation of Progress, pp. 1793–1806, 1995.
The World Health Organization, “Substitution maintenance therapy in the management of opioid dependence and HIV/AIDS prevention,” Annual Report World Health Organization, 2004.
J. Clarimon, R. R. Gray, L. N. Williams et al., “Linkage disequilibrium and association analysis of α-synuclein and alcohol and drug dependence in two American Indian populations,” Alcoholism: Clinical and Experimental Research, vol. 31, no. 4, pp. 546–554, 2007.
J. Gelernter, R. Gueorguieva, H. R. Kranzler et al., “Opioid receptor gene (OPRM1, OPRK1, and OPRD1) variants and response to naltrexone treatment for alcohol dependence: results from the VA Cooperative Study,” Alcoholism: Clinical and Experimental Research, vol. 31, no. 4, pp. 555–563, 2007.
T. Reich, H. J. Edenberg, A. Goate et al., “Genome-wide search for genes affecting the risk for alcohol dependence,” American Journal of Medical Genetics, vol. 81, no. 3, pp. 207–215, 1998.
J. Song, D. L. Koller, T. Foroud et al., “Association of GABAA receptors and alcohol dependence and the effects of genetic imprinting,” American Journal of Medical Genetics, vol. 117, no. 1, pp. 39–45, 2003.
K. S. Wang, X. F. Liu, Q. Y. Zhang, Y. Pan, N. Aragam, and M. Zeng, “A meta-analysis of two genome-wide association studies identifies 3 new loci for alcohol dependence,” Journal of Psychiatric Research, vol. 45, no. 11, pp. 1419–1425, 2011.
G. Kalsi, P. H. Kuo, F. Aliev et al., “A systematic gene-based screen of chr4q22–q32 identifies association of a novel susceptibility gene, DKK2, with the quantitative trait of alcohol dependence symptom counts,” Human Molecular Genetics, vol. 19, no. 12, pp. 2497–2506, 2010.
G. Kalsi, P. H. Kuo, F. Aliev et al., “A systematic gene-based screen of chr4q22–q32 identifies association of a novel susceptibility gene, DKK2, with the quantitative trait of alcohol dependence symptom counts,” Human Molecular Genetics, vol. 19, no. 20, pp. 4121–2506, 2010.
P. H. Kuo, G. Kalsi, C. A. Prescott et al., “Association of ADH and ALDH genes with alcohol dependence in the Irish affected sib pair study of alcohol dependence (IASPSAD) sample,” Alcoholism: Clinical and Experimental Research, vol. 32, no. 5, pp. 785–795, 2008.
L. J. Zuo, J. Gelernter, C. K. Zhang et al., “Genome-wide association study of alcohol dependence implicates KIAA0040 on chromosome 1q,” Neuropsychopharmacology, vol. 37, no. 2, pp. 557–566, 2012.
N. L. Saccone, S. F. Saccone, A. L. Hinrichs et al., “Multiple distinct risk loci for nicotine dependence identified by dense coverage of the complete family of nicotinic receptor subunit (CHRN) genes,” American Journal of Medical Genetics B, vol. 150, no. 4, pp. 453–466, 2009.
I. Deb, J. Chakraborty, P. K. Gangopadhyay, S. R. Choudhury, and S. Das, “Single-nucleotide polymorphism (A118G) in exon 1 of OPRM1 gene causes alteration in downstream signaling by mu-opioid receptor and may contribute to the genetic risk for addiction,” Journal of Neurochemistry, vol. 112, no. 2, pp. 486–496, 2010.
D. Proudnikov, K. S. LaForge, H. Hofflich et al., “Association analysis of polymorphisms in serotonin 1B receptor (HTR1B) gene with heroin addiction: a comparison of molecular and statistically estimated haplotypes,” Pharmacogenetics and Genomics, vol. 16, no. 1, pp. 25–36, 2006.
G. Gerra, L. Garofano, G. Santoro et al., “Association between low-activity serotonin transporter genotype and heroin dependence: behavioral and personality correlates,” American Journal of Medical Genetics, vol. 126, no. 1, pp. 37–42, 2004.
Z. Luo, G. F. Alvarado, D. K. Hatsukami, E. O. Johnson, L. J. Bierut, and N. Breslau, “Race differences in nicotine dependence in the Collaborative Genetic study of Nicotine Dependence (COGEND),” Nicotine and Tobacco Research, vol. 10, no. 7, pp. 1223–1230, 2008.
N. Breslau, E. O. Johnson, E. Hiripi, and R. Kessler, “Nicotine dependence in the United States: prevalence, trends, and smoking persistence,” Archives of General Psychiatry, vol. 58, no. 9, pp. 810–816, 2001.
G. M. Rivera, S. Antoku, S. Gelkop et al., “Requirement of Nck adaptors for actin dynamics and cell migration stimulated by platelet-derived growth factor B,” Proceedings of the National Academy of Sciences of the United States of America, vol. 103, no. 25, pp. 9536–9541, 2006.
H. Begleiter, T. Reich, V. Hesselbrock et al., “The collaborative study on the genetics of alcoholism,” Alcohol Health & Research World, vol. 19, no. 3, pp. 228–236, 1995.
A. Christoforou, M. Dondrup, M. Mattingsdal et al., “Linkage-disequilibrium-based binning affects the interpretation of GWASs,” American Journal of Human Genetics, vol. 90, no. 4, pp. 727–733, 2012.
D. Rabinowitz and N. Laird, “A unified approach to adjusting association tests for population admixture with arbitrary pedigree structure and arbitrary missing marker information,” Human Heredity, vol. 50, no. 4, pp. 211–223, 2000.