Abstract

Background/Objectives. To identify copy number variants (CNVs) that are associated with body mass index (BMI). Subjects/Methods. CNVs were identified using array comparative genomic hybridization (aCGH) on members of pedigrees ascertained through severely obese (BMI ≥ 35 kg/m2) sib pairs (86 pedigrees) and thin (BMI ≤ 23 kg/m2) probands (3 pedigrees). Association was inferred through pleiotropy of BMI with CNV intensity ratio. Results. A 77-kilobase CNV on chromosome 20q13.3, confirmed by real-time qPCR, exhibited deletions in the obese subjects and duplications in the thin subjects. Further support for the presence of a deletion was derived from likelihood-based inference of null alleles for SNPs residing in the region. Conclusions. One or more of the 7 genes residing in a chromosome 20q13.3 CNV region appear to influence BMI. The strongest candidate is ARFRP1, which affects glucose metabolism in mice.

1. Introduction

A number of single nucleotide polymorphisms (SNPs) have been reported to be associated with obesity [1]. Variants in some of the same genes are possibly associated with thinness [2]. Nevertheless, all body mass index- (BMI-) associated SNPs in combination fall short of accounting for the heritability of BMI [1].

Copy number variants (CNVs) may explain additional heritability of BMI. By definition, CNVs are chromosomal regions with sizes of 1 kilobase (kb) to several megabases (Mb) present in variable numbers in different individuals [3]. The Database of Genomic Variants (http://dgv.tcag.ca/dgv/app/home) currently lists >6,000,000 CNVs.

Eighty-four regions have been reported to harbor CNVs that are associated with obesity or BMI [4], including in African Americans [5, 6] and Chinese [7, 8]. However, association has been reported in multiple studies for only 4 of the 84 regions [4]. One of the 4 regions, a deletion on chromosome 1p31.1 [9], has also been identified through association with a SNP near NEGR1 [10, 11]. In the 3 other regions, obesity cooccurring with developmental delay was associated with deletions on chromosomes 10q11.22 [7, 9], 16p12.3 [4, 12], and 16p11.2 [13–17]. In addition, an enrichment of CNVs was observed in obese adults [18], severely obese children [19], and children with syndromic obesity [20].

Herein, we performed a genome-wide search for CNVs associated with BMI in members of extended pedigrees selected either for severe obesity or for healthy thinness, thereby excluding individuals with eating disorders, who have been the focus of most previous genetic studies of the lower end of the BMI range [21]. We studied extended pedigrees to increase our power to detect both common and rare CNVs associated with BMI. To identify associated CNVs, we tested for pleiotropy with BMI, thereby requiring both inheritance of any identified CNV and its correlation with BMI.

2. Subjects and Methods

2.1. Subjects

Obese pedigrees were ascertained in Utah using the Health Family Tree, a high school-based family history program designed to teach genetics and disease prevention as part of a mandatory health class [22]. Students, with parental assistance, reported disease information and risk factors for their parents’ first-degree relatives. We identified 435 severely obese (BMI ≥ 35 kg/m2) sib pairs from the Trees, chose 107 sibships for expansion, and examined 3,333 pedigree members. Similarly, we identified 40 thin (BMI ≤ 23 kg/m2) probands who reported multiple thin relatives and examined 400 family members including 265 thin individuals. We also ascertained a set of 1,156 unrelated Roux-en-Y gastric bypass (GB) patients [23]. All subjects were Utah residents with European ancestry. Each project (severe obesity, thinness, and GB) was approved by the Institutional Review Board of the University of Utah and informed consent was obtained.

For every studied member of the pedigrees, height was measured to the nearest centimeter using a Harpenden anthropometer (Holtain, Ltd.). Weight was measured in a hospital gown with a Scaletronix scale (model 5100) (Scaletronix Corporation, Wheaton, IL), which has an 800-pound capacity and weighing accuracy of 0.1 kilograms. BMI was computed as weight divided by height squared.
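The BMI computation and the ascertainment thresholds used in this study can be illustrated with a minimal sketch. The function names and the label "intermediate" are assumptions introduced here for illustration; only the formula and the thresholds (thin: BMI ≤ 23 kg/m2; severely obese: BMI ≥ 35 kg/m2) come from the text.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight (kg) and height (cm)."""
    height_m = height_cm / 100.0  # height was recorded in centimeters
    return weight_kg / height_m ** 2


def ascertainment_class(bmi_value: float) -> str:
    """Classify a BMI using the study's ascertainment thresholds.

    'intermediate' is a hypothetical label for values between the two
    thresholds; the text does not name that range.
    """
    if bmi_value <= 23.0:
        return "thin"
    if bmi_value >= 35.0:
        return "severely obese"
    return "intermediate"
```

For example, a subject weighing 120 kg with a height of 170 cm would fall in the severely obese class under these thresholds.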

2.2. Laboratory Methods
2.2.1. Array Comparative Genomic Hybridization

CNVs were identified using array comparative genomic hybridization (aCGH) on the NimbleGen platform. A Roche Nimblegen 3x720K genomics array was used for copy number detection. The Roche Nimblegen Human CGH3x720K Whole-Genome Tiling array contains 720,000 empirically tested probes per array, providing coverage of the whole human genome. In brief, genomic DNA was isolated from leukocytes using the QIAamp DNA Blood Kit (Qiagen, Valencia, CA) per the manufacturer’s instructions. Microarray analysis was performed following the array manufacturer’s protocol. The test and reference genomic DNA were separately labeled with different fluorescent dyes and cohybridized to the 720K whole-genome tiling array. Microarray slides were scanned using the Roche Nimblegen MS 200 Microarray Scanner (Roche Nimblegen, Madison, WI), and data captured from the scanned microarray image were saved as TIF files. In the array experiment, each subject was hybridized to the corresponding pedigree control. Data were extracted from scanned images using NimbleScan 2.5 extraction software, which allowed for automated grid alignment, extraction, and generation of data files. Using NimbleScan software, the ratios of the probe signal intensities were calculated. A region was defined as a deletion if it contained 5 or more consecutive probes with ratio values < −0.3 and as a duplication if it contained 5 or more consecutive probes with ratio values > 0.3. Results on family members were reported as the intensity ratio (LRR, where R represents intensity) of each CNV compared to a pedigree-specific control. CNVs were also identified for each control and reported as the intensity ratio compared to a pooled Promega sample of females (Promega Corp, Madison, WI) in order to eliminate CNVs shared by multiple family members that were also present in the control.
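The calling rule stated above (5 or more consecutive probes beyond ±0.3) can be sketched as follows. This is an illustration only and not the NimbleScan algorithm; the function name `call_cnvs` and the representation of probes as an ordered list of ratio values are assumptions made here.

```python
from itertools import groupby


def call_cnvs(ratios, threshold=0.3, min_probes=5):
    """Return (kind, first_probe_index, last_probe_index) for each called CNV.

    `ratios` is an ordered list of per-probe ratio values along a chromosome.
    """
    def state(value):
        if value > threshold:
            return "duplication"
        if value < -threshold:
            return "deletion"
        return None  # probe within the neutral band

    calls = []
    index = 0
    # groupby collapses consecutive probes that share the same state into runs
    for kind, run in groupby(ratios, key=state):
        length = len(list(run))
        if kind is not None and length >= min_probes:
            calls.append((kind, index, index + length - 1))
        index += length
    return calls
```

Runs shorter than 5 probes are discarded, so an isolated noisy probe or a short run does not produce a call.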

2.2.2. Real-Time qPCR

To validate statistically significant aCGH results, we used real-time qPCR to detect copy number gain or loss, thereby ensuring that the probes in a region were not hybridizing with a homologous region elsewhere in the genome or were false positive findings. Unique sequence primer pairs specific to the targeted regions and spanning regions that are proximal and distal to the breakpoints were designed and used in real-time qPCR reactions using the Duplex TaqMan CNV assay kits (custom TaqMan Copy Number Assay and TaqMan Copy Number Reference Assays which detect a single-copy gene in its respective reference genome assembly; Applied Biosystems, Foster City, CA). The assay consisted of two primers and a FAM-labeled probe. For quality control, each assay was run as a duplex TaqMan real-time PCR reaction, one containing a FAM dye-based assay for the targeted gene and a VIC dye-based assay for the reference gene RNase P. All assays were conducted according to the assay protocol (TaqMan Copy Number Assay protocol, Applied Biosystems, Foster City, CA) in an Optical 384-well plate. In brief, the PCR reaction mixture contained the following: 2x TaqMan Universal PCR Master Mix; 20x Copy Number Assay; 20x Copy Number Reference Assay; DNase-free water; and genomic DNA sample. Four replicates were used for each sample and the control sample was inserted randomly in each batch. PCR was performed in the Applied Biosystems 7900HT Fast Real-Time System. Real-time data was analyzed using CopyCaller software (Applied Biosystems, Foster City, CA). The number of copies of the target sequence in each test sample was determined by relative quantitation (RQ) using the comparative CT method. This method measured the CT difference (delta CT) between target and reference sequences and then compared the delta CT values of test samples to a calibrator sample(s) known to have two copies of the target sequence.
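The comparative CT calculation described above can be sketched in a few lines. This is a simplified illustration of the standard relative quantitation formula, not a reproduction of the CopyCaller software; the function names and the averaging of replicates by a simple mean are assumptions.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Delta CT: mean target CT minus mean reference (RNase P) CT."""
    return mean(target_cts) - mean(reference_cts)


def copy_number(sample_delta_ct, calibrator_delta_ct, calibrator_copies=2):
    """Estimated copies of the target sequence in a test sample.

    The calibrator is a sample known to carry `calibrator_copies` copies.
    Each additional PCR cycle corresponds to a twofold difference in
    starting template, hence the power of 2.
    """
    delta_delta_ct = sample_delta_ct - calibrator_delta_ct
    return calibrator_copies * 2 ** (-delta_delta_ct)
```

Under this formula, a test sample whose delta CT exceeds that of a two-copy calibrator by one cycle is estimated to carry 1 copy, consistent with a hemizygous deletion.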

2.2.3. SNP Genotyping

To test for a deletion, which presents as a null allele in SNP genotypes, we used the LightScanner (BioFire Diagnostics, SLC, UT) to genotype 5 SNPs within the chromosome 20q13.3 CNV region.

2.3. Statistical Methods
2.3.1. Association Analysis

We tested association with BMI of each autosomal CNV region containing ≥ 30 individuals for which LRR > 0.3 or LRR < −0.3, a restriction adopted to assure sufficient power and protect against false positives. The analysis used LRR as a continuous variable because of the difficulty in, and reduced power of, assigning specific copy number genotypes from aCGH data [24]. LRR was assigned as 0 to all controls and to each subject for whom intensity equaled his/her pedigree control. We separately analyzed each range defined by a distinct start and/or endpoint, selecting the largest positive or smallest negative LRR among an individual’s multiple measurements for overlapping regions. Consequently, a single LRR measurement might contribute to multiple CNV regions, making the tests nonindependent.
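The per-subject summary and the region inclusion rule can be sketched as follows. The sketch assumes that "largest positive or smallest negative" means the measurement with the largest absolute value, which is one reading of the text; the function names and the dictionary layout are also hypothetical.

```python
def summarize_lrr(measurements):
    """Collapse a subject's LRR measurements for one region to a single value.

    Assumption: the most extreme measurement (largest absolute value) is kept.
    A subject with no measurement matches the pedigree control and gets 0.
    """
    if not measurements:
        return 0.0
    return max(measurements, key=abs)


def region_is_testable(lrr_by_subject, threshold=0.3, min_carriers=30):
    """True if at least `min_carriers` subjects have LRR > 0.3 or LRR < -0.3."""
    carriers = sum(1 for lrr in lrr_by_subject.values() if abs(lrr) > threshold)
    return carriers >= min_carriers
```

Because the same measurement can be the most extreme value for several overlapping ranges, the resulting tests are not independent, as noted above.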

We detected association as pleiotropy between LRR and BMI, thereby requiring that LRR be both inherited and genetically correlated with BMI, and correspondingly accounting for relatedness within the sample. Using likelihood analysis in jPAP [25, 26], we assumed normality of BMI and LRR with the following parameters: the mean, standard deviation, and heritability of BMI and of LRR, and the genetic correlation (pleiotropy) and environmental correlation between BMI and LRR. P values were obtained from a 1 df chi-square statistic computed as twice the natural logarithm of the ratio of the maximized likelihood with the genetic correlation estimated to the maximized likelihood with the genetic correlation set equal to zero. To account for multiple testing, we made a Bonferroni correction for 3,988 tests, the number of independent tests estimated by eliminating any interval that overlapped with an interval for which more individuals had LRR > 0.3 or LRR < −0.3.
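The test statistic, its P value, and the multiple-testing correction can be sketched as below. This does not reproduce jPAP; it only shows the arithmetic applied to two maximized log likelihoods. The function names, the tuple layout for intervals, and the family-wise alpha of 0.05 used in the usage note are assumptions, since the text gives only the number of tests.

```python
import math


def lrt_statistic(loglik_estimated, loglik_null):
    """Twice the log of the likelihood ratio, from two maximized log likelihoods."""
    return 2.0 * (loglik_estimated - loglik_null)


def p_value_1df(statistic):
    """Upper-tail probability of a 1 df chi-square distribution."""
    if statistic <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(statistic / 2.0))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def count_independent(intervals):
    """Count intervals not overlapped by an interval with more carriers.

    `intervals` holds (start, end, n_carriers) tuples on one chromosome.
    """
    kept = 0
    for i, (start1, end1, n1) in enumerate(intervals):
        dominated = any(
            j != i and start2 <= end1 and start1 <= end2 and n2 > n1
            for j, (start2, end2, n2) in enumerate(intervals)
        )
        if not dominated:
            kept += 1
    return kept
```

With an assumed alpha of 0.05, `bonferroni_threshold(0.05, 3988)` gives a per-test threshold of roughly 1.25e-5.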

2.3.2. Deletion Analysis

Since a deletion presents as a null allele in SNP data, we tested for a nonzero frequency of a third allele in SNP genotypes. For each SNP, we defined two traits, one for each of the two alleles, and assigned each subject a phenotype for each trait corresponding to presence/absence of the respective allele in his/her genotype. Designating the null allele as 0 and the SNP alleles as 1 and 2, trait 1 penetrance was assigned as 98% for genotypes 0/1, 1/1, and 1/2 and 2% for genotypes 0/2 and 2/2. Correspondingly, trait 2 penetrance was assigned as 98% for genotypes 0/2, 2/2, and 1/2 and 2% for genotype 0/1 and 1/1. We thereby allowed a 2% genotyping error rate and assumed that deletion carriers were genotyped as homozygotes.
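The penetrance assignments above follow a single rule: a trait is expressed with probability 98% when its allele is present in the genotype and 2% otherwise. A minimal sketch of that rule follows; the tuple representation of genotypes and the function names are assumptions introduced for illustration.

```python
PRESENT = 0.98  # penetrance when the trait's allele is in the genotype
ABSENT = 0.02   # penetrance otherwise (allows a 2% genotyping error rate)


def penetrance(genotype, allele):
    """Penetrance of the presence/absence trait for `allele` (1 or 2).

    `genotype` is a pair of alleles drawn from 0 (null), 1, and 2.
    """
    return PRESENT if allele in genotype else ABSENT


def expected_call(genotype):
    """Genotype most likely to be reported; a null allele is not observed,
    so a deletion carrier such as 0/1 appears as the homozygote 1/1."""
    observed = sorted(a for a in genotype if a != 0)
    if len(observed) == 1:
        observed = observed * 2
    return tuple(observed)
```

For example, `expected_call((0, 1))` returns `(1, 1)`, which is why a deletion can be detected only when it produces an apparent Mendelian inconsistency between relatives.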

Applying a 3-allele bivariate model in jPAP [25, 26] to phenotypes corresponding to genotypes uncleaned of Mendelian errors, we estimated allele frequencies. Significance of the null allele frequency was obtained from a 50:50 mixture of 0 and 1 df chi-square distributions [27] assumed for twice the natural logarithm of the ratio of the likelihood maximized while estimating the null allele frequency to the likelihood maximized with the null allele frequency set to zero. Genotype probabilities were computed using jPAP, and mean BMI was obtained for individuals with >90% null allele carrier probability.
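Because the null hypothesis places the null allele frequency on the boundary of its range (zero), the P value comes from a mixture distribution rather than a plain 1 df test. A minimal sketch of that calculation and of the carrier summary follows; the function names and the (probability, BMI) pair layout are assumptions, and the likelihoods themselves would come from jPAP.

```python
import math


def mixture_p_value(statistic):
    """P value under a 50:50 mixture of 0 df and 1 df chi-square distributions.

    The 0 df component is a point mass at zero, so for a positive statistic
    only the 1 df component contributes, with weight one half.
    """
    if statistic <= 0.0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(statistic / 2.0))


def mean_bmi_of_carriers(subjects, min_probability=0.90):
    """Mean BMI of subjects whose carrier probability exceeds the cutoff.

    `subjects` is a list of (carrier_probability, bmi) pairs.
    Returns None if no subject passes the cutoff.
    """
    values = [bmi for probability, bmi in subjects if probability > min_probability]
    return sum(values) / len(values) if values else None
```

The effect of the mixture is to halve the P value that a standard 1 df test would give for the same statistic.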

3. Results

We performed array comparative genomic hybridization (aCGH) on a sample of 882 subjects selected from the pedigrees. The majority of subjects (97%) were members of 86 pedigrees ascertained through severely obese sib pairs; the rest were members of 3 pedigrees ascertained through thin probands (Table 1). For each pedigree, we selected as a control one family member lacking the ascertainment trait: for 18 obese pedigrees, the control was thin, defined in our analysis as a BMI ≤ 23 kg/m2; for all other pedigrees, including the 3 thin pedigrees, the control was nonobese (23 kg/m2 < BMI ≤ 30 kg/m2).

Every subject harbored CNVs defined as LRR > 0.3 or LRR < −0.3; the number of CNVs per subject ranged from 3 to 589 with a mean of 142 (Table 2). Despite wide variation between subjects, the variation in the number of CNVs could not be attributed to either subject’s obesity status or pedigree type (Table 2).

We tested for association as pleiotropy between BMI and LRR in 84 CNV regions that were previously reported to be associated with obesity [4]. 37 of the 84 regions contained no CNVs in our sample. For another 36 regions, fewer than 30 subjects harbored CNVs. None of the remaining 11 regions showed association with BMI.

However, BMI showed significant pleiotropy with 3 distinct CNV regions: on chromosomes 11, 18, and 20 (Table 3). Negative genetic correlations estimated for all three regions corresponded to deletions in the obese subjects and duplications in the thin subjects. The significant CNVs ranged from 3 and 12 kb on chromosomes 18 and 11, respectively, to 77 kb on chromosome 20. While the regions on chromosomes 11 and 18 were intergenic, the region on chromosome 20 encompassed 7 genes (RTEL1, TNFRSF6B, ARFRP1, ZGPAT, LIME1, SLC2A4RG, and ZBTB46).

To confirm the aCGH-detected CNVs, we applied qPCR to selected subjects from the aCGH sample. For each of the chromosomes 11, 18, and 20 regions, we tested an obese sib pair with aCGH-detected deletions and a thin sib pair with aCGH-detected duplications, plus additional relatives of each sib pair. In the chromosome 20 region, both obese sibs exhibited 1 copy and both thin sibs exhibited 3 copies; furthermore, a third deletion/duplication was found among relatives of the obese/thin sib pair (Table 4).

In contrast, qPCR on chromosomes 11 and 18 identified 2 copies for all tested subjects. This implies that the findings were either off-target effects of the CNV probes or false positive effects from the aCGH methodology.

As further evidence of a deletion on chromosome 20, we genotyped regional SNPs on 3,217 subjects, including the unrelated gastric bypass patients and members of the severe obesity and thin pedigrees, regardless of inclusion within the CNV sample. We obtained significant null allele frequencies for 4 of 5 SNPs in the presumed deletion region (Table 5). The null allele frequencies, ranging from 0.7% to 1.4%, were undoubtedly underestimated since detection of null alleles required that a parent and offspring be homozygous for different alleles. For comparison, significance of a null allele was not obtained for any of 3 SNPs (rs981782, rs1288775, and rs7182723) with similar MAF (0.26 to 0.46) that reside on other chromosomes and were genotyped on the same sample. Null allele carriers for the chromosome 20 SNPs were obese, although this observation is based on very small numbers (Table 5).

4. Discussion

We inferred a BMI-associated CNV on chromosome 20q13.3. Deletions and duplications in the region, identified in our sample using aCGH and confirmed using qPCR, were previously reported for HapMap data with designation esv2758806 [28]. Our inference of null alleles for regional SNPs provided additional evidence for a regional deletion.

Association of BMI with the chromosome 20q13.3 CNV was inferred from pleiotropy, which required that LRR demonstrate inheritance within the families as well as genetic correlation with BMI; we therefore exploited the relatedness within the sample to increase power and protect against false positives. Our qPCR and SNP genotype results also supported association with BMI.

The 77 kb CNV region on chromosome 20q13.3 harbors 7 genes (RTEL1, TNFRSF6B, ARFRP1, ZGPAT, LIME1, SLC2A4RG, and ZBTB46). Of these, the best candidate for an effect on BMI is ARFRP1, a GTPase that regulates protein trafficking between intracellular organelles. GTPases are involved with signaling of G protein-coupled receptor pathways and were previously implicated for CNVs associated with obesity [19]. ARFRP1 participates in the control of lipid droplet and chylomicron formation [29] and has been shown to affect glucose metabolism in the mouse [30].

The chromosome 20q13.3 CNV exerts a dosage effect on BMI such that hemizygosity results in obesity and duplication results in thinness. A similar dosage effect reported for a CNV on chromosome 16p11.2 differs in the cooccurrence of developmental defects with obesity/thinness [16]. Dosage effects increase the power for CNV detection; the chromosome 20q13.3 CNV was undetectable upon excluding thin sample members and we detected no CNVs that affected exclusively obesity or exclusively thinness.

We limited testing to regions with CNVs detected in a minimum of 30 sample members to assure sufficient power. Subsequent testing of regions with 20–29 CNV-positive individuals using parametric analysis and with 10–19 CNV-positive individuals using nonparametric analysis (results not shown) revealed no additional BMI-associated CNVs. One explanation for the absence of rare CNVs is the corresponding absence in our sample of developmental or other syndromes that have been reported in conjunction with most rare obesity-associated CNVs.

Using linkage analysis on a subset of our obese pedigrees, we previously localized a susceptibility gene to chromosome 4p14, later identified as TBC1D1 [31], as well as yet unidentified susceptibility genes on chromosomes 4q34-35 and 20q13.1 [32]. The chromosome 20 linkage region falls 20 Mb centromeric of the CNV detected herein and no CNV was detected on chromosome 4q; therefore, none of our reported linkage regions can be attributed to a CNV. Chromosome 20q13 has been identified through linkage analysis of BMI in other samples as well [33–36], but the linked region is always more centromeric than the CNV reported herein.

Despite the inference of numerous obesity susceptibility regions across the genome, all confirmed variants in combination fail to fully account for the heritability of BMI [37]. Obesity gene discovery is complicated by complex inheritance that includes genetic and environmental interactions [38]. Although the contribution of the chromosome 20q13.3 CNV to inherited obesity awaits confirmation, the region contains a strong candidate in ARFRP1. In addition, future obesity investigations should consider a strength of our study: the increased power provided by the inclusion of thin subjects.

In summary, we detected a CNV on chromosome 20q13.3 that is hemizygous in the obese subjects and duplicated in the thin subjects. The region encompasses 7 genes of which ARFRP1 appears to be the best candidate.

Conflict of Interests

The authors have no conflict of interests to declare.

Authors’ Contribution

Sandra J. Hasstedt performed the statistical analysis. Yuanpei Xin performed the laboratory experiments. Rong Mao and Tracey Lewis designed and oversaw the laboratory experiments. Ted D. Adams collected data on subjects. Steven C. Hunt oversaw the project. All authors were involved in writing the paper and had final approval of the submitted version.

Acknowledgment

This research was supported by NIH DK082938.