BioMed Research International

Special Issue

Pharmacogenomics in Personalized Medicine and Drug Metabolism


Research Article | Open Access

Volume 2014 | Article ID 897653

Yu-Huei Cheng, Li-Yeh Chuang, Hsueh-Wei Chang, Cheng-Hong Yang, "Improved Candidate Drug Mining for Alzheimer’s Disease", BioMed Research International, vol. 2014, Article ID 897653, 8 pages, 2014.

Improved Candidate Drug Mining for Alzheimer’s Disease

Academic Editor: Wei Chiao Chang
Received 25 Dec 2013
Accepted 19 Jan 2014
Published 27 Feb 2014


Alzheimer's disease (AD) is the main cause of dementia in older people. Although several antidementia drugs such as donepezil, rivastigmine, galantamine, and memantine have been developed, the effectiveness of AD drug therapy is still far from satisfactory. Recently, single nucleotide polymorphisms (SNPs) have been adopted as markers for personalized medicine. Many pharmacogenomics databases have been developed to provide comprehensive information by associating SNPs with drug responses, disease incidence, and genes that are critical in choosing personalized therapy. However, we found that some information from different parts of pharmacogenomics databases is incomplete, which may limit their usefulness for pharmacogenomics. To address this problem, we used an approximate string matching method and a data mining approach to improve the searching of a pharmacogenomics database. With this approach, we identified more genes linked to AD and AD-related drugs than the existing online search returns. These improvements may help to advance the pharmacogenomics of AD for personalized medicine.

1. Introduction

Alzheimer’s disease (AD), the most common form of dementia, was first reported in 1906 [1]. In 2006, there were about 26.6 million AD patients worldwide, and the disease was also common in southern Taiwan [2]. Although AD was identified long ago, most research progress has been made in the last 30 years [3]. However, no definitive cure is available for this disease, and it eventually leads to death. Therefore, drug discovery for Alzheimer’s disease remains challenging.

Single nucleotide polymorphisms (SNPs) are the most common variation in human genomes [4]. The importance of SNPs has been reviewed in genome-wide association studies for their association with disease susceptibility and drug metabolism [5, 6]. About 60–90% of the individual variation in drug response depends on pharmacogenomic factors. Therefore, SNP genotyping for candidate genes, pharmacological research, and drug discovery may play an increasingly important role in AD treatment. Meanwhile, the growing amount of related information requires bioinformatics support to construct suitable databases and web servers.

Recently, PharmGKB (the Pharmacogenetics and Pharmacogenomics Knowledge Base) has been constructed to provide a comprehensive database for pharmacogenomic studies [7]. PharmGKB supports the pharmacogenetics research network in terms of SNP discovery and drug responses [8], with fully curated knowledge of drug pathways, drug-related genes, and relationships among genes, drugs, and diseases. However, the information provided by the different functions of PharmGKB is not sufficiently cross-linked, which makes it inconvenient to move between them.

To solve this problem, we propose a data mining method that improves pharmacogenomic searching for AD, based on the dataset downloaded from the PharmGKB resource.

2. Materials and Methods

The flowchart for pharmacogenomics in AD for personalized drug studies is shown in Figure 1. First, the AD-related drugs and genes are retrieved from the PharmGKB download data using an approximate string matching method and a data mining approach. The genes associated with AD and the genes associated with a single Alzheimer’s drug are identified and compared with the results of the PharmGKB online search. Then, the SNPs of the genes associated with AD are identified. With SNP genotyping tools or assays, the association of these SNPs with AD-related drugs may be evaluated. Finally, the resulting information may be helpful for personalized drug research.

2.1. AD-Related Drugs Using Approximate String Matching Based on PharmGKB Download Data

To study the pharmacogenomics of AD, we downloaded the PharmGKB (the Pharmacogenetics and Pharmacogenomics Knowledge Base) dataset [9, 10] and used it as the source for an approximate string matching method [11] that finds all AD-related drug classes. The meaningful keywords associated with “Alzheimer’s disease” are shown in Table 1. The drug classes found in this way are then used to find associated genes with a data mining approach. The approximate string matching problem for AD-related drug classes is stated as follows: given a pattern string P, that is, a meaningful keyword associated with “Alzheimer’s disease,” and a text string T, that is, the description for a drug and disease retrieved from PharmGKB, find a substring in T that has the smallest edit distance [12] to the pattern P. The pseudocode for the edit distance is shown in Algorithm 1.


Table 1: Meaningful keywords associated with “Alzheimer’s disease.”

No. | Keyword
2 | Alzheimer's disease
3 | AD - Alzheimer's disease
4 | Acute Confusional Senile Dementia
5 | Alzheimer Dementia, Presenile
6 | Alzheimer Disease, Early Onset
7 | Alzheimer Disease, Late Onset
8 | Alzheimer Type Dementia
9 | Alzheimer Type Senile Dementia
10 | Alzheimer's Disease, Focal Onset
11 | Alzheimer's disease, NOS
12 | Dementia, Alzheimer Type
13 | Dementia, Presenile
14 | Dementia, Presenile Alzheimer
15 | Dementia, Primary Senile Degenerative
16 | Dementia, Senile
17 | Dementias, Presenile
18 | Dementias, Senile
19 | Disease, Alzheimer
20 | Disease, Alzheimer's
21 | Early Onset Alzheimer Disease
22 | Focal Onset Alzheimer's Disease
23 | Late Onset Alzheimer Disease
24 | Presenile Alzheimer Dementia
25 | Presenile Dementia
26 | Presenile Dementias
27 | Primary Senile Degenerative Dementia
28 | Senile Dementia
29 | Senile Dementia, Acute Confusional
30 | Senile Dementia, Alzheimer Type
31 | Senile Dementias
32 | MeSH: D000544 (Alzheimer Disease)
33 | MedDRA: 10001896 (Alzheimer's disease)
34 | NDFRT: N0000000363 (Alzheimer Disease [Disease/Finding])
35 | SnoMedCT: 26929004 (Alzheimer's disease)
36 | UMLS: C0002395 (C0002395)

Drug class is one of the functions listed in the PharmGKB download data.

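Algorithm 1: Pseudocode for the edit distance.

// P: pattern string of length m; T: text string of length n
// D: (m + 1) × (n + 1) table of edit distances
// initialization
for i = 0 to m do
  D[i][0] = i
end for
for j = 0 to n do
  D[0][j] = j
end for
// edit distance
for i = 1 to m do
  for j = 1 to n do
    if (P(i) = T(j)) then
      D[i][j] = D[i-1][j-1]
    else
      min = MIN[D[i-1][j], D[i][j-1], D[i-1][j-1]]
      D[i][j] = min + 1
    end if
  end for
end for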
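The edit-distance recurrence and the substring search described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the `max_errors` threshold, and the lowercasing step are assumptions introduced here. The substring variant uses the standard trick of initializing the first row to zero so that a match may start anywhere in the text.

```python
def edit_distance(pattern: str, text: str) -> int:
    """Levenshtein distance between two whole strings (unit costs)."""
    n = len(text)
    prev = list(range(n + 1))          # row 0: distance from "" to text[:j]
    for i in range(1, len(pattern) + 1):
        curr = [i] + [0] * n           # column 0: distance from pattern[:i] to ""
        for j in range(1, n + 1):
            if pattern[i - 1] == text[j - 1]:
                curr[j] = prev[j - 1]
            else:
                curr[j] = min(prev[j], curr[j - 1], prev[j - 1]) + 1
        prev = curr
    return prev[n]


def best_substring_distance(pattern: str, text: str) -> int:
    """Smallest edit distance between pattern and any substring of text.

    Same recurrence as above, but row 0 is all zeros (a match may start
    anywhere) and the result is the minimum of the last row (a match may
    end anywhere).
    """
    n = len(text)
    prev = [0] * (n + 1)
    for i in range(1, len(pattern) + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            if pattern[i - 1] == text[j - 1]:
                curr[j] = prev[j - 1]
            else:
                curr[j] = min(prev[j], curr[j - 1], prev[j - 1]) + 1
        prev = curr
    return min(prev)


def mentions_keyword(description, keywords, max_errors=1):
    """True if any keyword occurs in the description within max_errors edits.

    max_errors is a hypothetical tolerance chosen for illustration only.
    """
    text = description.lower()
    return any(
        best_substring_distance(keyword.lower(), text) <= max_errors
        for keyword in keywords
    )
```

For example, `mentions_keyword("Treatment of Alzheimers disease", ["Alzheimer's disease"])` accepts the description even though the apostrophe is missing, which is the kind of spelling variation that exact matching would miss.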

2.2. Data Mining Method for PharmGKB Download Data

In this study, we used the Apriori algorithm [13] for frequent item set mining and association rule learning over PharmGKB. The pseudocode for the Apriori algorithm for data mining in PharmGKB is shown in Algorithm 2. First, the Apriori algorithm finds the frequent genes in each drug class for “Alzheimer’s disease.” A set of genes can be mined from each drug class. The Apriori algorithm is a “bottom-up” approach, in which frequent gene subsets are extended one item at a time (i.e., candidate generation) and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found.

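Algorithm 2: Pseudocode for the Apriori algorithm for data mining in PharmGKB.

Apriori(PharmGKB, ε)
  L1 ← {frequent genes in drug class for Alzheimer’s disease}
  k ← 2
  while L(k-1) ≠ ∅ do
    Ck ← candidate gene sets of size k generated from L(k-1)
    for each drug class t ∈ PharmGKB do
      Ct ← {c ∈ Ck | c ⊆ t}
      for each candidate gene c ∈ Ct do
        count[c] ← count[c] + 1
      end for
    end for
    Lk ← {c ∈ Ck | count[c] ≥ ε}
    k ← k + 1
  end while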
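The bottom-up candidate generation and counting described above can be sketched as a small, generic Apriori routine. This is an illustrative sketch under assumptions: each drug class is treated as one transaction whose items are gene symbols, `min_support` plays the role of the support threshold, and the toy drug classes and their gene assignments are invented for the example rather than taken from PharmGKB.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting against every transaction.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy data: drug class name -> set of associated gene symbols.
toy_classes = {
    "class_A": {"APOE", "PTGS2", "MTHFR"},
    "class_B": {"APOE", "PTGS2"},
    "class_C": {"APOE", "HTR2A"},
    "class_D": {"PTGS2", "MTHFR", "APOE"},
}
frequent_gene_sets = apriori(toy_classes.values(), min_support=2)
```

With this toy input, HTR2A is dropped at the first level because it appears in only one class, so no larger gene set containing it is ever generated or counted. That pruning is what keeps the search tractable as the item sets grow.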

2.3. SNP Searching for Genes Using the NCBI dbSNP

Every gene contains numerous SNPs. To find the SNPs of each gene relevant to Alzheimer’s pharmacogenomics, the NCBI dbSNP database was searched in this study.
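One possible way to obtain a per-gene SNP count programmatically is the NCBI E-utilities `esearch` endpoint with `db=snp`. The sketch below only builds the request URL and parses the count from a JSON response, so it makes no network call itself. The helper names are hypothetical, and the `[Gene Name]` field tag in the search term is an assumption that should be checked against the current dbSNP search fields. The study itself reports using online dbSNP queries and does not state that this interface was used.

```python
import json
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_snp_count_url(gene_symbol: str) -> str:
    """Build an esearch URL asking dbSNP how many records match a gene.

    The "[Gene Name]" field tag is an assumption made for illustration.
    """
    params = {
        "db": "snp",
        "term": f"{gene_symbol}[Gene Name]",
        "retmode": "json",
        "retmax": 0,  # only the total count is needed, not the record IDs
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_snp_count(response_text: str) -> int:
    """Extract the total hit count from an esearch JSON response body."""
    return int(json.loads(response_text)["esearchresult"]["count"])
```

A caller would fetch `build_snp_count_url("APOE")` with any HTTP client and pass the response body to `parse_snp_count`. Keeping the URL construction and parsing separate from the network call makes both steps easy to check offline.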

3. Results and Discussion

3.1. AD Information Based on PharmGKB Search

In the PharmGKB online search, the SNP variants, related genes, and drugs for AD can be retrieved. For example, SNP information such as rs2066853 and rs6313 is provided (Figure 2). As shown in Figure 3, AD-related genes such as ADRB1, AHR, HTR2A, MTHFR, and PTGS2 are identified, and related drugs such as olanzapine and risperidone are returned. This information may assist researchers studying the pharmacogenomics of AD. Unfortunately, the PharmGKB online search provides only limited information, which is insufficient for the complexity of drug research for Alzheimer’s personalized medicine.

3.2. PharmGKB-Based Data Mining of AD Information of Drug Classes or Gene Symbols

In the current study, our proposed method is used to mine the PharmGKB download data for the keyword “Alzheimer’s disease.” As shown in Table 2, 22 AD-related drug classes are identified from the “drug classes” of PharmGKB. Their corresponding PharmGKB accession IDs, PubMed PMIDs, and the numbers of genes associated with each AD-related drug class are also presented. In total, 495 genes are identified for AD from the drug classes (see Supplementary file 1, available online, which provides the PharmGKB accession ID, gene symbol, and publications for the genes in each class). Alternatively, 99 genes associated with AD are identified from the “gene symbols” of PharmGKB for the keyword “Alzheimer’s disease.” These results suggest that the same keyword, for example, Alzheimer’s disease, may identify different numbers of AD-associated genes depending on whether the “drug classes” or the “gene symbols” of PharmGKB are searched.

Table 2: AD-related drug classes identified from the “drug classes” of PharmGKB.

No. | PharmGKB accession ID | Drug classes | Publications*1 | Gene no.*2
1 | PA164712423 | Anticholinesterases | PMID: 20644562, 20644562, 14674789 | 6
2 | PA164712308 | Ace inhibitors, plain | PMID: 17362841 | 24
3 | PA449515 | Etanercept | PMID: 19027875 | 12
4 | PA451262 | Rivastigmine | PMID: 20644562, 16323253, 17082448, 20644562, 15289797, 17522596 |
5 | PA450243 | Lithium | PMID: 17082448 | 13
6 | PA10384 | Anti-inflammatory and antirheumatic products, nonsteroids | PMID: 17082448, 17082448 | 11
7 | PA449760 | Glatiramer acetate | PMID: 17082448 | 4
8 | PA133950441 | Hmg coa reductase inhibitors | PMID: 17082448 | 39
9 | PA151958596 | Curcumin | PMID: 17082448 | 2
10 | PA451898 | Vitamin c | PMID: 17082448 | 16
11 | PA451900 | Vitamin e | PMID: 17082448 | 1
12 | PA452229 | Antidepressants | PMID: 17082448 | 43
13 | PA452233 | Antipsychotics | PMID: 17082448 | 46
14 | PA449726 | Galantamine | PMID: 20644562, 16323253, 17082448, 15853556, 20644562, 14674789, 12177686 |
15 | PA10364 | Memantine | PMID: 17082448 | 0
16 | PA451283 | Rosiglitazone | PMID: 16770341 | 34
17 | PA448031 | Acetylcholine | PMID: 15695160 | 8
18 | PA450626 | Nicotine | PMID: 15695160 | 88
19 | PA137179528 | Nimesulide | PMID: 16331303, 11810182 | 3
20 | PA449394 | Donepezil | PMID: 20859244, 20644562, 16323253, 16424819, 17082448, 20644562, 1973817012142731 |
21 | PA451576 | Tacrine | PMID: 9521254, 17082448, 10801254, 9777427, 18004213 |
22 | PA448976 | Choline | PMID: 8618881 | 122

*1 PMID: PubMed article ID number.
*2 The full gene names for each of the “drug classes” are provided in Supplementary file 1.

After detailed examination, 67 genes found in the gene symbol search (gene names in bold in Table 3) are absent from the genes found in the drug class search (Table 2). Furthermore, no genes corresponding to the drug “memantine” listed in Table 2 (drug classes) are found in Table 3 (gene symbols). Thus, for some current drugs the drug class search identifies only a small number of AD-related genes, whereas additional AD-related genes that may affect AD-related drugs can be partly discovered through the gene symbol search. These newly identified AD-related genes may be potential candidates for further AD drug development. These results suggest that our proposed data mining method may improve AD pharmacogenomics studies.
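The comparison between the two searches amounts to set operations on gene symbols. A minimal sketch follows, with a hypothetical helper name and placeholder gene symbols (`GENE_A` and so on) standing in for the real search results, which are not reproduced here.

```python
def compare_gene_sets(drug_class_genes, gene_symbol_genes):
    """Split genes by which search found them (sorted for stable output)."""
    drug_class_genes = set(drug_class_genes)
    gene_symbol_genes = set(gene_symbol_genes)
    return {
        "only_in_gene_symbol_search": sorted(gene_symbol_genes - drug_class_genes),
        "only_in_drug_class_search": sorted(drug_class_genes - gene_symbol_genes),
        "in_both": sorted(drug_class_genes & gene_symbol_genes),
    }


# Placeholder inputs for illustration only; not results from PharmGKB.
comparison = compare_gene_sets(
    drug_class_genes={"GENE_A", "GENE_B", "GENE_C"},
    gene_symbol_genes={"GENE_B", "GENE_C", "GENE_D", "GENE_E"},
)
```

In this toy case, `comparison["only_in_gene_symbol_search"]` lists the genes that the drug class search would have missed, which corresponds to the role played by the bold entries in Table 3.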

Table 3: AD-related genes identified from the “gene symbols” of PharmGKB.

No. | PharmGKB accession ID | Gene symbols* | SNP no.

*Gene names in bold fonts are not identified in Table 2.

3.3. SNP Information of AD-Related Genes

The SNP numbers for the 99 AD-related genes are also provided in Table 3. The SNP number for each gene is obtained from online NCBI dbSNP queries. In general, many SNPs are found in these AD-related genes, and some of them have been reported to be associated with AD. For example, the APOE gene appears in Table 3, and one of its variants, the ApoE epsilon 4 allele, has been reported to be associated with AD [14]. With suitable SNP genotyping tools, these SNP candidates merit investigation in pharmacogenomics research on AD.

Currently, many high-throughput SNP genotyping methods are available (as shown in Figure 1), including PCR resequencing [15], TaqMan probes [16], SNP microarrays [17], Matrix Assisted Laser Desorption/Ionization-Time of Flight (MALDI-TOF) [18], and others [19, 20]. Furthermore, some SNP genotyping tools and databases have also been developed, such as SNP-RFLPing2 for comprehensive PCR-RFLP information based on SNPs [21–24], algorithmic PCR-RFLP primer design and restriction enzymes for SNP genotyping [25, 26], and primer design for PCR-confronting two-pair primers (PCR-CTPP) [27, 28]. These tools and methods can provide useful and convenient information for SNP genotyping in AD pharmacogenomics studies.

4. Conclusions

AD is the most common form of dementia in older people, and its pharmacogenomics remains a challenge. In this study, we propose a PharmGKB-based data mining method to improve gene discovery for potential AD-related drug candidates. With the assistance of bioinformatics, this improvement can help researchers develop personalized therapeutic drugs for AD.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgments

This work was partly supported by the National Science Council (NSC) in Taiwan under Grant nos. NSC101-2622-E151-027-CC3, NSC101-2221-E-464-001, NSC101-2320-B-037-049, NSC102-2221-E151-024-MY3, NSC102-2221-E214-039, and NSC102-2221-E-464-004, by the National Sun Yat-Sen University-KMU Joint Research Project (no. NSYSU-KMU 103-p014), and by the Ministry of Health and Welfare, Taiwan (MOHW103-TD-B-111-05).

Supplementary Materials

Gene information, including PharmGKB accession ID, gene symbol, and publications, is provided for the different drug classes.

  1. Supplementary Tables


References

  1. N. C. Berchtold and C. W. Cotman, “Evolution in the conceptualization of dementia and Alzheimer's disease: Greco-Roman period to the 1960s,” Neurobiology of Aging, vol. 19, no. 3, pp. 173–189, 1998.
  2. M.-Y. Shiau, L. Yu, H.-S. Yuan, J.-H. Lin, and C.-K. Liu, “Functional performance of Alzheimer's disease and vascular dementia in southern Taiwan,” The Kaohsiung Journal of Medical Sciences, vol. 22, no. 9, pp. 437–446, 2006.
  3. W. Thies and L. Bleiler, “2013 Alzheimer's disease facts and figures,” Alzheimer's & Dementia, vol. 9, no. 2, pp. 208–245, 2013.
  4. L. Kruglyak and D. A. Nickerson, “Variation is the spice of life,” Nature Genetics, vol. 27, no. 3, pp. 234–236, 2001.
  5. J. Voisey and C. P. Morris, “SNP technologies for drug discovery: a current review,” Current Drug Discovery Technologies, vol. 5, no. 3, pp. 230–235, 2008.
  6. H. W. Chang, L. Y. Chuang, M. T. Tsai, and C. H. Yang, “The importance of integrating SNP and cheminformatics resources to pharmacogenomics,” Current Drug Metabolism, vol. 13, no. 7, pp. 991–999, 2012.
  7. K. Sangkuhl, D. S. Berlin, R. B. Altman, and T. E. Klein, “PharmGKB: understanding the effects of individual genetic variants,” Drug Metabolism Reviews, vol. 40, no. 4, pp. 539–551, 2008.
  8. K. M. Giacomini, C. M. Brett, R. B. Altman et al., “The pharmacogenetics research network: from SNP discovery to clinical drug response,” Clinical Pharmacology & Therapeutics, vol. 81, no. 3, pp. 328–345, 2007.
  9. T. E. Klein, J. T. Chang, M. K. Cho et al., “Integrating genotype and phenotype information: an overview of the PharmGKB project. Pharmacogenetics Research Network and Knowledge Base,” The Pharmacogenomics Journal, vol. 1, no. 3, pp. 167–170, 2001.
  10. L. Gong, R. P. Owen, W. Gor, R. B. Altman, and T. E. Klein, “PharmGKB: an integrated resource of pharmacogenomic data and knowledge,” Current Protocols in Bioinformatics, vol. 23, pp. 14.7.1–14.7.17, 2008.
  11. G. Navarro, “A guided tour to approximate string matching,” ACM Computing Surveys, vol. 33, no. 1, pp. 31–88, 2001.
  12. M. Gilleland, “Levenshtein distance, in three flavors,” Merriam Park Software, 2009.
  13. R. Agrawal and R. Srikant, “Fast algorithms for mining association rules in large databases,” in Proceedings of the 20th International Conference on Very Large Data Bases (VLDB '94), pp. 487–499, Santiago, Chile, 1994.
  14. Y. C. Yen, C. K. Liu, F. W. Lung, and M. Y. Chong, “Apolipoprotein E polymorphism and Alzheimer's disease,” The Kaohsiung Journal of Medical Sciences, vol. 17, no. 4, pp. 190–197, 2001.
  15. J. Zhang, D. A. Wheeler, I. Yakub et al., “SNPdetector: a software tool for sensitive and accurate SNP detection,” PLoS Computational Biology, vol. 1, no. 5, article e53, 2005.
  16. P. Borgiani, C. Ciccacci, V. Forte et al., “CYP4F2 genetic variant (rs2108622) significantly contributes to warfarin dosing variability in the Italian population,” Pharmacogenomics, vol. 10, no. 2, pp. 261–266, 2009.
  17. S. Sõber, E. Org, K. Kepp et al., “Targeting 160 candidate genes for blood pressure regulation with a genome-wide genotyping array,” PLoS ONE, vol. 4, no. 6, Article ID e6034, 2009.
  18. T. J. Griffin and L. M. Smith, “Single-nucleotide polymorphism analysis by MALDI-TOF mass spectrometry,” Trends in Biotechnology, vol. 18, no. 2, pp. 77–84, 2000.
  19. P.-Y. Kwok, “SNP genotyping with fluorescence polarization detection,” Human Mutation, vol. 19, no. 4, pp. 315–323, 2002.
  20. M. Olivier, “The Invader assay for SNP genotyping,” Mutation Research, vol. 573, no. 1-2, pp. 103–110, 2005.
  21. M. Ota, H. Fukushima, J. K. Kulski, and H. Inoko, “Single nucleotide polymorphism detection by polymerase chain reaction-restriction fragment length polymorphism,” Nature Protocols, vol. 2, no. 11, pp. 2857–2864, 2007.
  22. H.-W. Chang, C.-H. Yang, P.-L. Chang, Y.-H. Cheng, and L.-Y. Chuang, “SNP-RFLPing: restriction enzyme mining for SNPs in genomes,” BMC Genomics, vol. 7, article 30, 2006.
  23. L.-Y. Chuang, C.-H. Yang, K.-H. Tsui et al., “Restriction enzyme mining for SNPs in genomes,” Anticancer Research, vol. 28, no. 4, pp. 2001–2007, 2008.
  24. H.-W. Chang, Y.-H. Cheng, L.-Y. Chuang, and C.-H. Yang, “SNP-RFLPing 2: an updated and integrated PCR-RFLP tool for SNP genotyping,” BMC Bioinformatics, vol. 11, article 173, 2010.
  25. C.-H. Yang, Y.-H. Cheng, C.-H. Yang, and L.-Y. Chuang, “Mutagenic primer design for mismatch PCR-RFLP SNP genotyping using a genetic algorithm,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 9, no. 3, pp. 837–845, 2012.
  26. L. Y. Chuang, Y. H. Cheng, C. H. Yang, and C. H. Yang, “Associate PCR-RFLP assay design with SNPs based on genetic algorithm in appropriate parameters estimation,” IEEE Transactions on NanoBioscience, vol. 12, no. 2, pp. 119–127, 2013.
  27. N. Hamajima, “PCR-CTPP: a new genotyping technique in the era of genetic epidemiology,” Expert Review of Molecular Diagnostics, vol. 1, no. 1, pp. 119–123, 2001.
  28. C.-H. Yang, Y.-H. Cheng, L.-Y. Chuang, and H.-W. Chang, “Confronting two-pair primer design for enzyme-free SNP genotyping based on a genetic algorithm,” BMC Bioinformatics, vol. 11, article 509, 2010.

Copyright © 2014 Yu-Huei Cheng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
