Abstract

The CYP2D6 enzyme is involved in the metabolism of many commonly prescribed drugs. Polymorphisms in the CYP2D6 gene may modulate enzyme level and activity, thereby affecting individual responses to pharmacological treatment in terms of drug levels, efficacy, and adverse reactions. Aims. This study aimed to determine CYP2D6 allele frequencies in Sardinians and to compare them with frequencies found in the Caucasian population. Methods and Materials. We used a Long PCR strategy coupled to direct genomic DNA sequencing analysis. An allele-specific amplification was carried out to infer the correct allelic phase. The TaqMan Gene Copy Number Assay (Applied Biosystems) was used to verify the presence of gene deletions/multiplications. Results and Conclusions. Our results indicated that CYP2D6 allele frequencies in Sardinians differed from those previously detected in the Caucasian population. Moreover, three new SNPs and four novel haplotypes were identified.

1. Introduction

Cytochrome P450 2D6 (CYP2D6) is encoded by a highly polymorphic gene and is responsible for the metabolism of several key endogenous substrates and other xenobiotics [1] and of about 25% of the most commonly prescribed drugs [2–4] (Table 1). More than 130 Single Nucleotide Polymorphisms (SNPs) have been identified within the CYP2D6 gene, including numerous nonsynonymous variations, as well as silent, promoter, and intronic changes. The 82 allelic variants and numerous subvariants reported to date are summarized on the Human CYP Allele Nomenclature Committee website [5]. The presence of SNPs can alter CYP2D6 enzymatic activity, with effects that range considerably within a population, which includes individuals with ultrarapid (UM), extensive (EM), intermediate (IM), and poor (PM) metabolizer status. Furthermore, rearrangements within the gene locus have created multiple functional gene copies or deleted the entire gene, resulting in increased or absent drug metabolism, respectively [6, 7]. Genotypic analysis to identify individual polymorphisms has become increasingly important during drug development and for the selection of individualized therapies. This analysis aims to increase the number of responders and to limit the incidence of adverse events.

In our previous work [8], we tried to create a complete genotyping method for the simultaneous analysis of CYP3A4, CYP3A5, CYP2C9, CYP2C19, and CYP2D6 SNPs. The genotyping of the CYP2D6 gene was difficult due to its polymorphic nature, the presence of two flanking pseudogenes, and copy number variants. To avoid false genotyping, resulting from nonspecific coamplification of the highly homologous pseudogenes, the analysis of this gene was not included in our previous study.

For this reason, we aimed to create a primary CYP2D6 PCR strategy based on the amplification of the entire gene (6.572 Kb) coupled to direct genomic DNA sequencing analysis. This strategy avoids the false genotyping that would result from nonspecific coamplification of the homologous pseudogenes.

The choice of the population to be studied may be critical. Isolated populations, such as that found in Sardinia [9, 10], could be extremely useful for mapping novel SNPs and haplotypes for specific genes. There are very few studies on CYP450 subfamily genes in Sardinian people [8, 11, 12]; therefore, we analyzed the CYP2D6 gene for sequence variations in Sardinians and compared the resulting allele frequencies with those observed in the Caucasian population.

2. Methods and Materials

2.1. DNA Samples

Genomic DNA extracted from blood samples of 250 healthy Sardinian individuals was provided by Professor Francesco Cucca, INN-CNR Cagliari Director. All participating individuals provided informed consent to genetic testing.

2.2. Long Primary PCR

Selective amplification of the CYP2D6 gene was carried out using a Long PCR protocol. In particular, a forward primer (P–1780, Table 2) was designed in a highly nonhomologous CYP2D6/CYP2D7P/CYP2D8P 5′ untranslated region (nucleotides –1780_–1758 according to the AY545216 sequence [13]). The reverse primer used was 2D6–R (Table 2), as previously described [14]. A 5′ 10-mer tag (5′-ACGTTGGATG-3′) was added to both PCR primers in order to improve PCR efficiency. PCR reactions were performed in a final volume of 50 μL using the QIAGEN (Hilden, Germany) LongRange PCR Kit protocol [15] with the following minor modifications: 100 ng genomic DNA, 500 μM of each PCR Primer (Metabion, Martinsried, Germany), 1 U QIAGEN LongRange PCR Enzyme, 1X QIAGEN LongRange PCR Buffer (containing 25 mM MgCl2), and 800 μM Invitrogen (CA, USA) 2′-deoxynucleoside-5′-triphosphate (dNTP) Set PCR Grade. The PCR conditions were as follows: initial denaturation at 93°C for 3 min; 40 cycles at 93°C for 30 s, 61°C for 30 s, and 68°C for 6 min.

2.3. CYP2D6 Sequencing Analysis

For sequencing reactions, we used primers designed with both the Assay Designer Suite [16] and Primer3 v.0.4.0 [17], together with others taken from the published literature [14, 18] (Table 3). Forward and reverse primers were used to sequence both strands of the whole gene. Unincorporated dNTPs in the generated 6.572 Kb CYP2D6 amplicon were inactivated by adding 20 μL of ExoSAP–IT (USB, OH, USA) to each 50 μL aliquot of PCR product. Purification conditions were as follows: 37°C for 60 min and 85°C for 15 min. To each 2.5 μL aliquot of purified PCR product, 7 μM of each sequencing primer (Metabion), 1X BigDye Terminator v3.1 Ready Reaction mix (Applied Biosystems), 1.5X BigDye Terminator (Applied Biosystems), and sterile double-distilled H2O were added to reach a final volume of 10 μL. Thermal cycler conditions were as follows: 25 cycles at 95°C for 30 s, 50°C for 15 s, and 60°C for 4 min. For primers CYP2D6_01_F, 1863_1864insTTTCGCCCC, and 2291G>A, the annealing conditions were modified as follows: 25 cycles at 95°C for 30 s, 60°C for 15 s, and 60°C for 4 min. For primers P–1780 and 2D6–R [14], an initial denaturation was added as follows: initial denaturation at 95°C for 3 min; 25 cycles at 95°C for 30 s, 50°C for 15 s, and 60°C for 4 min. All amplified products were cleaned with 125 mM EDTA and ethanol, resuspended in 12 μL formamide, and denatured at 92°C for 2 min before sequencing analysis. DNA sequencing reactions were carried out using a 3730 48-Capillary DNA Analyzer (Applied Biosystems) and MicroAmp Optical 96 Well Plates (Applied Biosystems). Sequences deposited in GenBank (M33388 [19], AY545216 [13]) served as references for CYP2D6.

2.4. Copy Number Variation (CNV) Analysis

The presence of the CYP2D6 gene deletion (CYP2D6*5 allele) and of duplications/multiplications was evaluated in all 250 Sardinian DNA samples using the CYP2D6 TaqMan Gene Copy Number Assay (Applied Biosystems) [20]. Reactions were performed in a final volume of 10 μL containing 10 ng of genomic DNA, 2X TaqMan Universal PCR Master Mix, 1X TaqMan VIC–RNase P Assay, and 1X TaqMan FAM-labelled CYP2D6 gene-specific assay Hs00010001_cn. Thermal cycler conditions were as follows: 95°C for 10 min followed by 40 cycles of 95°C for 15 s and 60°C for 60 s. The presence of gene deletions or duplications/multiplications was evaluated using the analysis software supplied with the TaqMan CNV Assay.
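The text states only that the software supplied with the assay was used to call copy number. As an illustration of the arithmetic that duplex assays of this kind commonly rely on, the following minimal sketch applies the comparative Ct (ΔΔCt) calculation to a FAM target signal and a VIC RNase P reference signal. The function names, the use of a calibrator sample with two known copies, and all Ct values are assumptions made for illustration; they are not taken from the study or from the vendor software.

```python
def copy_number(ct_target, ct_reference,
                calib_ct_target, calib_ct_reference, calib_copies=2):
    """Estimate gene copy number with the comparative Ct method.

    ct_target / ct_reference: Ct of the CYP2D6 (FAM) and RNase P (VIC)
    signals in the test sample.
    calib_*: the same two Ct values in a calibrator sample assumed to
    carry `calib_copies` copies (hypothetical calibrator).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calib = calib_ct_target - calib_ct_reference
    delta_delta_ct = delta_ct_sample - delta_ct_calib
    # One extra cycle for the target means half as much template.
    return calib_copies * 2 ** (-delta_delta_ct)


def classify(copies):
    """Map an estimated copy number to a coarse call."""
    n = int(copies + 0.5)  # round to the nearest whole copy
    if n == 0:
        return "homozygous deletion"
    if n == 1:
        return "one copy (consistent with a heterozygous deletion)"
    if n == 2:
        return "two copies"
    return "duplication/multiplication"


# Hypothetical Ct values: the target crosses threshold one cycle later
# than in the two-copy calibrator, which implies a single copy.
estimate = copy_number(26.0, 25.0, 25.0, 25.0)
print(estimate, classify(estimate))
```

A sample whose ΔCt matches the calibrator returns two copies; each additional cycle of delay in the target signal halves the estimate.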

2.5. Haplotype Determination/Identification

We used Maximum-Likelihood Estimation (MLE), based on the Expectation-Maximization (EM) algorithm [21, 22], to infer haplotype combinations from genotype information. MLE is a statistical tool for fitting a statistical model to data and providing estimates of the model’s parameters. Many well-known estimation methods in statistics are special cases of MLE, and it is particularly useful when cost or time constraints allow only a sample of the overall population to be measured. For example, if the data are assumed to follow a Gaussian distribution with an unknown mean and variance, both can be estimated from the sample alone: MLE finds the values of mean and variance under which the observed results are most probable. In this case, we supplied the program with the information included in the Human CYP Allele Nomenclature Committee [5] web database. We implemented a partial-match strategy to assign the most probable haplotype using the typical set of polymorphisms that defines each allelic variant. The assignment of the most probable haplotype to each sample assumes Hardy-Weinberg equilibrium.
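The program used for this step is not described in detail, so the following is only a minimal sketch of the textbook EM procedure for estimating haplotype frequencies from unphased genotypes under Hardy-Weinberg equilibrium. The genotype coding (0, 1, or 2 copies of the variant allele at each biallelic site), the function names, and the toy data are assumptions for illustration; the partial-match step against the nomenclature database is not reproduced here.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """Return all unordered haplotype pairs consistent with a genotype.

    `genotype` is a tuple with one entry per biallelic site giving the
    number of variant alleles (0, 1, or 2).
    """
    per_site = {0: [(0, 0)], 1: [(0, 1), (1, 0)], 2: [(1, 1)]}
    pairs = set()
    for combo in product(*(per_site[g] for g in genotype)):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate haplotype frequencies by EM, assuming random mating."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in pair_lists
                         for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    n_chromosomes = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: probability of each phase under current frequencies
            # (factor 2 for heterozygous pairs under Hardy-Weinberg).
            weights = [freq[h1] * freq[h2] * (1 if h1 == h2 else 2)
                       for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts become the new frequencies.
        freq = {h: counts[h] / n_chromosomes for h in haplotypes}
    return freq


# Toy data: two sites, ten individuals; the three double heterozygotes
# are the only genotypes whose phase is ambiguous.
genotypes = [(0, 0)] * 4 + [(2, 2)] * 3 + [(1, 1)] * 3
freqs = em_haplotype_frequencies(genotypes)
for hap, f in sorted(freqs.items()):
    print(hap, round(f, 4))
```

In this toy data set the unambiguous homozygotes pull the estimate toward the (0, 0) and (1, 1) haplotypes, so the ambiguous double heterozygotes are resolved as (0, 0)/(1, 1) and the estimates converge to 0.55 and 0.45.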

2.6. CYP2D6 Single Allele Analysis

To verify the reliability of the MLE/EM algorithm and the presence of novel SNP-haplotype associations, we modified the Long Range PCR protocol and applied it to a new sequencing analysis in samples heterozygous for the –1584G>C SNP. We modified the −1584G>C primer (Table 3) by adding a 3′-terminal nucleotide complementary to either the wild-type (WT) nucleotide or the SNP (MUT), thereby creating an allele-specific amplification. For each sample, two Long PCR reactions were carried out using P–1584_WT and P–1584_MUT (Table 2) as forward primers. The reverse primer was 2D6–R [14] for both PCR reactions. A 5′ 10–mer tag (5′-ACGTTGGATG-3′) was added to both PCR primers, creating a 6.379 Kb amplicon. In this way, we were able to determine the phase directly. PCR reactions were performed in a final volume of 50 μL using 100 ng genomic DNA, 400 μM of each PCR Primer (Metabion), 1 U QIAGEN LongRange PCR Enzyme, 1X QIAGEN LongRange PCR Buffer (containing 25 mM MgCl2), and 800 μM Invitrogen dNTP Set PCR Grade. The PCR conditions were as follows: initial denaturation at 93°C for 3 min; 12 cycles at 93°C for 30 s, 67°C for 30 s, and 68°C for 3 min; 28 cycles at 93°C for 30 s, 65°C for 30 s, and 68°C for 6 min. All forward and reverse sequencing reactions using the primers shown in Table 3 were carried out as indicated in “CYP2D6 Sequencing Analysis”. The P–1780 and –1426C>T primers were not included in this single allele sequencing analysis.

3. Results and Discussion

A sample of 250 unrelated healthy Sardinian individuals was analyzed and their haplotype phases were defined. A CYP2D6 sequencing method was developed using a Long PCR strategy coupled with direct genomic sequencing analysis in order to avoid false genotyping resulting from non-specific co-amplification of the highly homologous CYP2D7P and CYP2D8P pseudogenes. Analyses were performed by forward and reverse direct genomic DNA sequencing using ExoSAP-IT and BigDye Terminator v3.1 protocols. The CYP2D6 Applied Biosystems CNV Assay was used to detect the presence of duplications or multiplications in all sequenced samples. The same assay was used to evaluate the presence of the CYP2D6 deletion (CYP2D6*5 allele), as it could not be detected by sequencing analysis. We used the MLE based on the EM algorithm [21, 22] to infer the corresponding allele variants from the samples’ genotypes. For some samples, it was not possible to infer any of the haplotypes indicated on the Human CYP Allele Nomenclature Committee website [5], due to the presence of new SNP associations not described there. In fact, we found, in both the homozygous and heterozygous statuses, new haplotypes that could be identified as hybrids between the CYP2D6*2M and *41 alleles (“Sardinian haplotype 1”, SH1) and as a modification of the CYP2D6*2M allele (“Sardinian haplotype 3”, SH3) (Table 4). Moreover, we found three novel SNPs (Table 4, Figure 1). In two samples presenting SH1, sequencing analysis revealed two novel SNPs in heterozygous status (“Sardinian haplotype 2”, SH2): –948C>A, found in the 5′-UTR, and the silent 3176C>T, found in Exon 7. In addition, in one sample presenting SH3, we found one novel SNP (“Sardinian haplotype 4”, SH4), 3948T>G, found in Intron 8.
To verify the reliability of the MLE/EM algorithm and the presence of these new haplotypes and novel SNPs, and to infer the correct phase, we decided to apply a new sequencing analysis. In samples heterozygous for the –1584G>C SNP, it was possible to modify the Long Range PCR protocol to create an allele-specific amplification and to apply it to a new sequencing analysis. Sequencing analysis using forward and reverse strands of genomic DNA confirmed the presence of the new haplotypes. For the novel SNPs, it was possible to infer the correct phase (Table 4, Figure 1). For known haplotypes, the CYP2D6 Single Allele Analysis results confirmed the statistical results obtained with the MLE/EM algorithm.

Allele frequencies detected in the 250 Sardinian individuals (Table 5) were compared with frequencies previously reported for Caucasians [11, 12, 18, 23–26]; the comparison is reported in Table 6. In particular, differences were found in the CYP2D6*2 and *41 alleles. Readers should note that the frequencies of these two alleles can be compared only with Raimundo et al. [23] and Sistonen et al. [11], which were published after the key SNP for CYP2D6*41 at position 2988G>A was described. The CYP2D6*2 frequency was similar to the results of Raimundo et al. (17.5%) [23], but in disagreement with a previous study in Sardinians (35.7%) [11]. This difference was probably due to the limited number of subjects analyzed in [11]. In particular, we found only the CYP2D6*2A, *2L, and *2M suballeles; CYP2D6*2L (2.2%) and *2M (1.6%) were previously described in a South African population [27, 28]. Moreover, we detected two new haplotypes, SH3 in 1.6% of the alleles and SH4, presenting the novel SNP 3948T>G, in 0.2% of the alleles (Table 4). We considered these haplotypes as variants of the CYP2D6*2M suballele. On the contrary, the CYP2D6*41 frequency was significantly higher than those previously described (17.8% v/s 3.6–8.4%) [11, 23]. We found two new haplotypes, SH1 in 8.2% of the alleles and SH2, presenting the novel SNPs –948C>A and 3176C>T, in 0.4% of the alleles (Table 4). We considered these haplotypes to be hybrids between the CYP2D6*2M and *41 alleles, but included them in the CYP2D6*41 frequency analysis due to the presence of the key SNP 2988G>A, which causes a splicing defect. As for other alleles:
(i) CYP2D6*1, *3, and *4 frequencies are similar to those in reference studies; only subvariants CYP2D6*1A, *1B, *1D, *3B, and *4A were found.
(ii) CYP2D6*6, *9, and *35 frequencies are lower than previously described. Only subvariant CYP2D6*6A was found. The CYP2D6*9 frequency is similar to that found by Fuselli et al. (0.6%) [12], even though that study did not find CYP2D6*9 in Sardinians.
(iii) For CYP2D6*10 and *15, significantly higher frequencies were detected compared with Caucasians, but the CYP2D6*10 frequency was in agreement with previous studies of Sardinians [11, 12]. Only subvariant CYP2D6*10B was found.
(iv) CYP2D6*20 and *28 alleles were not detected in reference studies, but were found in Sardinians with frequencies of 0.2% and 0.8%, respectively.

Deletions and duplications/multiplications were analyzed using the Applied Biosystems CNV Assay. CYP2D6*5, the entire gene deletion, had a significantly lower frequency as compared to the Caucasian population (1% v/s 4–7%), but was in agreement with previous studies in Sardinians (1.0–1.8%) [11, 12]. CYP2D6*1xN, duplication of the CYP2D6*1 allele, was found with a frequency of 0.8%, in agreement only with Sachse et al. (0.5%) [18], but not with other reference works. CYP2D6*2xN, duplication of the CYP2D6*2 allele, has a frequency similar to those previously described (1.2% v/s 1–5%). No other kinds of allele duplications were found.
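The percentages reported above are easiest to interpret as allele frequencies, that is, as fractions of the 500 chromosomes carried by 250 diploid individuals. The short check below makes that arithmetic explicit. It assumes a denominator of 2N chromosomes, which the text does not state explicitly, and the variable names are illustrative.

```python
N_INDIVIDUALS = 250
N_CHROMOSOMES = 2 * N_INDIVIDUALS  # diploid: two alleles per individual

# Frequencies (%) quoted in the text for selected alleles and haplotypes.
reported_percent = {
    "CYP2D6*5": 1.0,
    "CYP2D6*1xN": 0.8,
    "CYP2D6*2xN": 1.2,
    "SH1": 8.2,
    "SH2": 0.4,
    "SH3": 1.6,
    "SH4": 0.2,
}


def implied_chromosomes(freq_percent, n_chromosomes=N_CHROMOSOMES):
    """Number of chromosomes implied by an allele frequency in percent."""
    return freq_percent / 100 * n_chromosomes


counts = {allele: round(implied_chromosomes(freq), 6)
          for allele, freq in reported_percent.items()}
for allele, n in counts.items():
    print(f"{allele}: {n:g} of {N_CHROMOSOMES} chromosomes")
```

Under this assumption every quoted percentage corresponds to a whole number of chromosomes; for example, 0.2% corresponds to a single chromosome, which agrees with SH4 being observed in one heterozygous sample.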

Sardinian genotype frequencies for the CYP2D6 gene did not deviate from expectations based on Hardy-Weinberg equilibrium at a significance level of 0.05 (χ2 test P-value = 0.091098) (Table 7). The predicted enzymatic activity of the 250 Sardinian individuals was different from that found in other studies (Table 8) [18, 23, 29]. EM and PM frequencies in the Sardinian subjects were lower than those previously described (29.6% v/s 37.3–56.7% and 4.4% v/s 7.2–15.9%, resp.). On the contrary, IM frequencies were higher than those found in other studies (62.0% v/s 27.4–36.2%). UM frequencies were similar in Sardinians and Caucasians (2.4% v/s 1.3–2.6%). Finally, for the 1.6% of Sardinian subjects presenting the CYP2D6*28 allele in the heterozygous status, we did not infer the predicted enzymatic activity.
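The details of the Hardy-Weinberg test (number of genotype classes, degrees of freedom) are given in Table 7 rather than in the text. As a simplified illustration of the principle, the sketch below runs a χ2 goodness-of-fit test for a single biallelic locus, where the test has one degree of freedom and the P-value can be computed with the standard library. The genotype counts are hypothetical and are not study data; the multi-allelic CYP2D6 case involves more genotype classes.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium, biallelic locus.

    Returns (chi2, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (not from the study): excess of homozygotes.
chi2, p_value = hwe_chi_square(30, 40, 30)
print(f"chi2 = {chi2:.3f}, P = {p_value:.4f}")
```

A P-value above the chosen significance level (0.05 in the text) means that the hypothesis of Hardy-Weinberg equilibrium is not rejected.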

4. Conclusions

Differences in drug responses could be due to genetic factors. Knowledge of individual genetic profiling is clinically important and provides benefits for future medical care by predicting the drug response or developing DNA based tests. Substantial interindividual variability in response to specific therapies might be caused by the presence of polymorphisms in genes encoding components of drug metabolism pathways, such as the CYP450 family genes. The CYP2D6 isoenzyme is involved in the metabolism of drugs such as antipsychotics, antidepressants, β blockers, and antiarrhythmics. Polymorphisms in this gene have been thoroughly investigated, including their associations with the incidence of adverse reactions.

In this study, we have developed a primary CYP2D6 PCR strategy based on the amplification of the entire gene coupled to direct genomic DNA sequencing analysis. Moreover, by modifying the Long PCR forward primer, we have implemented a rapid strategy to infer the phase by direct sequencing.

In this way, we have established the frequencies of most of the CYP2D6 Caucasian alleles in 250 Sardinian samples and found some important differences from those reported in Caucasians. In our population, we found the suballeles CYP2D6*2L and *2M, previously found in a South African population [27, 28], with significant frequencies (2.2% and 1.6%, resp.) and, moreover, one novel SNP, 3948T>G, and two new haplotypes, SH3 and SH4, that we categorized as variants of CYP2D6*2M. Furthermore, we found two novel SNPs, –948C>A and 3176C>T, and two new haplotypes, SH1 and SH2, that we categorized as CYP2D6*2M–*41 hybrids. Due to the presence of the key SNP 2988G>A, we classified SH1 and SH2 as subvariants of CYP2D6*41 in the frequency analysis. Newly discovered SNPs were deposited in the NCBI Single Nucleotide Polymorphism database (dbSNP) [30–32] and novel haplotype sequences were deposited in NCBI GenBank [33–36]. Finally, there was a significant increase in genotypes correlated to predicted IM and PM phenotypes, a total of 66.4% of subjects v/s ~43% in reference studies [18, 29]. The differences found between our study and previous Sardinian analyses were probably due to the limited number of SNPs and subjects analyzed by Sistonen et al. [11] and Fuselli et al. [12]. These results should be considered for future medical care when a subject is Sardinian or of direct Sardinian descent, particularly to avoid adverse drug reactions, given the high frequency of PM and IM subjects.

Conflict of Interests

The authors declare that there is no conflict of interests.

Acknowledgments

The authors gratefully acknowledge Professor Francesco Cucca, INN-CNR Cagliari Director, for kindly providing the Sardinian DNA samples; Dr. Maristella Pitzalis and Dr. Francesca Deidda for technical assistance in DNA sequencing; and Dr. Luisella Saba, Dr. Elena Congeddu, and Dr. Enrico Sorisio, PharmaNess Sole Director, for the useful information provided. Special thanks go to Professor Annalisa Marchi, who kindly allowed the completion of this paper in her laboratories.