Abstract

Phosphoinositide-dependent kinase-1 (PDK-1) is an important therapeutic target for the treatment of cancer. To identify the important chemical features of PDK-1 inhibitors, a 3D QSAR pharmacophore model was developed based on 21 available PDK-1 inhibitors. The best pharmacophore model (Hypo1) exhibits all the important chemical features required for PDK-1 inhibitors. The correlation coefficient, root mean square deviation (RMSD), and cost difference were 0.96906, 1.0719, and 168.13, respectively, suggesting that Hypo1 has good predictive ability and is the best of the ten pharmacophore models analyzed. Hypo1 was further validated by Fisher’s randomization method (95% confidence level), the test set method, and the decoy set method, which gave a goodness of hit (GH) score of 0.73. The validated model was then used as a 3D query to screen molecules from the NCI and Maybridge databases. The resulting hit compounds were filtered by Lipinski’s rule of five and by an ADMET study. A docking study was carried out to refine the retrieved hits and thereby reduce the rate of false positives. The best hits will be subjected to in vitro study in the future.

1. Introduction

Protein kinases are critical components of cellular signal transduction cascades [1]. Over 500 protein kinases have been reported in the human genome to date, and they are considered the second largest group of drug targets [2, 3]. Phosphoinositide-dependent kinase-1 (PDK-1), a 63 kDa serine/threonine kinase, is a major player in the PI3-kinase signaling pathway that regulates gene expression, cell cycle, growth, and proliferation [4–11]. PDK-1 is also termed the “master kinase” because it phosphorylates highly conserved serine or threonine residues in the T-loop (or activation loop) of numerous AGC kinases, including PKB/AKT, PKC, p70S6K, SGK, and PDK-1 itself [12]. Although the precise regulatory mechanisms vary, in the case of PKB/AKT, activation by PDK-1 is critically dependent on prior PI3 kinase activation and the presence of phosphatidylinositol-(3,4,5)-trisphosphate (PIP3). A significant proportion (40–50%) of all tumors involve mutations in PIP3-3-phosphatase (PTEN) [13–15], which result in elevated levels of PIP3 and enhanced activation of PKB/AKT, p70S6K, and SGK. Inhibitors of PDK-1 could therefore provide valuable therapeutic agents for the treatment of cancer.

The recognition process between a ligand and its receptor is based on the spatial distribution of certain structural features of the active site being complementary to those of the interacting ligands, so the features common to the ligands provide information about the active site. Pharmacophore mapping is an essential step towards understanding the receptor-ligand recognition process and is established as one of the successful computational tools in rational drug design [16, 17]. It involves identifying the three-dimensional arrangement of functional groups that a molecule must possess to be recognized by the receptor. A model is then generated by finding chemically important functional groups that are common to the molecules that bind. A pharmacophore can be derived by direct analysis of the structure of a known ligand, either in its most stable conformer or in the form observed when it is complexed with the target protein.

In the present study, a three-dimensional pharmacophore model for PDK-1 kinase inhibitors has been developed. The generated model is further utilized for screening potentially active candidates from the NCI and Maybridge databases. The binding of these compounds is further assessed by molecular docking.

2. Material and Methods

2.1. General Methodology

Pharmacophore model generation and Hypo1-based virtual screening were performed using the following tools.

HypoGen. It was implemented in Catalyst (Catalyst 4.1, Molecular Simulations Inc., San Diego, CA).

Fisher Randomization Test. It was done by the CatScramble program implemented in Catalyst.

Lipinski Filtration. It was performed using Pipeline Pilot Studio (SciTegic Inc., San Diego, CA).

LigandFit. Docking studies were carried out using Discovery Studio 2.5 (Accelrys Inc., San Diego, CA).

2.2. Data Set for Pharmacophore Analysis

A set of 83 different compounds that have been identified and reported as inhibitors of PDK-1 kinase was collected from the literature [18–21]. The inhibitory activity of each compound was expressed as IC50 (i.e., the concentration of compound required to inhibit 50% of PDK-1 kinase activity). The IC50 values spanned a wide range, from 3.0 to 65,000 nM. Of the 83 compounds, 21 were selected as the training set and the remaining 62 were taken as the test set. The chemical structures of all training set compounds are shown in Figure 1. The training set and test set were selected according to the following rules: (i) there is structural diversity among the molecules, (ii) both the training set and the test set cover a wide range of activity, and (iii) the most active compounds are included in the training set because they provide critical information for pharmacophore generation. The geometry of all compounds was built using Accelrys Discovery Studio 2.5 [22]. All the compounds were minimized using the steepest descent algorithm with a convergence gradient value of 0.001 kcal/mol, and a family of representative conformations was generated by fast conformational analysis using the poling algorithm [23] and CHARMM force field parameters [24]. A large number of conformations of each compound were generated within an energy threshold of 20.0 kcal/mol above the global energy minimum.
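The 20.0 kcal/mol energy window described above can be illustrated with a minimal Python sketch. The conformer names and energies below are hypothetical values chosen for illustration, and the function `within_energy_window` is not part of Discovery Studio; it only shows the filtering rule.

```python
# Hypothetical conformer energies (kcal/mol) for one compound; not data from this study.
conformer_energies = {
    "conf_1": 12.4,
    "conf_2": 15.9,
    "conf_3": 31.0,
    "conf_4": 40.2,
}

ENERGY_WINDOW = 20.0  # kcal/mol above the global minimum, as stated in the text


def within_energy_window(energies, window=ENERGY_WINDOW):
    """Return the conformers whose energy lies within `window` of the lowest energy."""
    e_min = min(energies.values())
    return [name for name, energy in energies.items() if energy - e_min <= window]


kept = within_energy_window(conformer_energies)
print(kept)
```

In this example conf_4 lies 27.8 kcal/mol above the lowest-energy conformer and is discarded, while the other three are retained.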

2.3. Pharmacophore Modeling

Based on the conformations of each compound, the HypoGen module of Discovery Studio 2.5 was used to construct the possible pharmacophore models [25]. Instead of only the lowest energy conformation, all the conformational models of each training set compound were used for pharmacophore hypothesis generation. The 21 training set compounds, together with their conformations, were submitted to the 3D QSAR Pharmacophore Generation (HypoGen) protocol of Discovery Studio 2.5. The HypoGen module generates hypotheses with features that are common to the active molecules and missing from the inactive molecules.

2.4. Model Validation

Statistical parameters such as the cost values determine the significance of a model. The best model was selected on the basis of a high correlation coefficient, the lowest total cost, and a low RMSD; in addition, the total cost should be close to the fixed cost and far from the null cost. Another parameter, the configuration cost, is also important for determining the significance of the model and should be <17. The best hypothesis, Hypo1, was also validated by the test set method, Fisher's randomization method, and the decoy set method. The ligand pharmacophore mapping protocol was used to estimate the activity of all 62 test set compounds.

2.5. Decoy Set Validation

The results of the test set validation could only indicate that the generated pharmacophore model (Hypo1) picks active molecules efficiently; they were not conclusive, because the model also picked inactive molecules. The decoy set validation method was therefore used to evaluate the efficiency of Hypo1 further by calculating the GH (goodness of hit list) score and the EF (enrichment factor). A decoy set of 2,000 small molecules was generated by decoy set finder 1.1; it comprised 1,980 molecules with unknown activity and 20 active molecules. GH and EF were calculated by the following equations:

$$\mathrm{EF} = \frac{H_a / H_t}{A / D}$$

$$\mathrm{GH} = \left(\frac{H_a\,(3A + H_t)}{4\,H_t\,A}\right)\left(1 - \frac{H_t - H_a}{D - A}\right)$$

where $H_t$ is the number of molecules in the hit list, $H_a$ is the number of active molecules present in the hit list, $A$ is the number of active molecules present in the database, and $D$ is the number of molecules present in the decoy set.

The GH score ranges from 0 to 1. A GH score of 0 indicates a null model, while a GH score of 1 indicates an ideal model. A GH score higher than 0.7 reflects a very good model. The EF and GH were found to be 69.23 and 0.73, respectively (Table 1), indicating that the generated pharmacophore model is suitable for virtual screening.
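The EF and GH calculations can be sketched in Python as follows. The sketch assumes the commonly used Güner-Henry definitions, EF = (Ha/Ht)/(A/D) and GH = [Ha(3A + Ht)/(4 Ht A)] × [1 − (Ht − Ha)/(D − A)]. The hit-list counts (`ha = 18`, `ht = 26`) are hypothetical values chosen for illustration and are not taken from Table 1; only A = 20 and D = 2,000 come from the text.

```python
def enrichment_factor(ha, ht, a, d):
    """EF: fraction of actives in the hit list relative to the fraction in the database."""
    return (ha / ht) / (a / d)


def goodness_of_hit(ha, ht, a, d):
    """GH score (0 = null model, 1 = ideal model), assuming the Guner-Henry definition."""
    yield_and_recall = ha * (3 * a + ht) / (4 * ht * a)
    false_positive_penalty = 1 - (ht - ha) / (d - a)
    return yield_and_recall * false_positive_penalty


# A and D are from the text; ha and ht are hypothetical illustration values.
a, d = 20, 2000
ha, ht = 18, 26

ef = enrichment_factor(ha, ht, a, d)
gh = goodness_of_hit(ha, ht, a, d)
print(f"EF = {ef:.2f}, GH = {gh:.2f}")
```

The first factor of GH rewards a hit list that is both rich in actives and recovers most of the known actives, while the second factor penalizes inactive molecules in the hit list.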

2.6. Virtual Screening and ADMET Analysis

The final validated hypothesis (Hypo1) was used as a 3D structural query for retrieving potent compounds from the NCI database and the Maybridge database, which contain 238,819 molecules and 2,000 molecules, respectively. A schematic diagram of the virtual screening protocol is shown in Figure 2.

3. Results and Discussion

3.1. Pharmacophore Modeling

Ten hypotheses were produced by the 3D QSAR pharmacophore generation module of Accelrys Discovery Studio 2.5 from the 21 training set compounds (Table 2). Hypo1 was the most significant hypothesis, characterized by a high cost difference (168.48433), the lowest root mean square deviation (1.0719), and the best correlation coefficient (0.96906). The fixed cost and the null cost were 77.5618 and 258.686, respectively, and the total cost of Hypo1 was 90.2017. The total cost was thus much lower than the null cost and close to the fixed cost.

The best hypothesis (Hypo1) consists of four features: two hydrogen bond acceptors (HBA), one hydrogen bond donor (HBD), and one hydrophobic aliphatic feature (HyA). Figures 3(a) and 3(b) show the features of Hypo1 and its distance and angular constraints, respectively. The experimental and estimated activities of the 21 training set compounds for Hypo1 are shown in Table 3. Figure 4(a) shows Hypo1 mapped onto the most active compound of the training set (compound_1), and Figure 4(b) shows the mapping of the least active compound of the training set (compound_21).

3.2. Cost Analysis

In addition to generating a hypothesis, HypoGen also provides two theoretical costs (expressed in bits) to help assess the validity of the hypothesis. The first is the fixed cost (the cost of an ideal hypothesis), which represents the simplest model that fits all data perfectly. The second is the null cost (the cost of the null hypothesis), which represents the highest cost of a pharmacophore with no features and which estimates every activity to be the average of the activity data of the training set molecules. They represent the lower and upper limits, respectively, for the hypotheses that are generated. A meaningful pharmacophore hypothesis may be obtained when the difference between the null cost and the cost of the hypothesis is large; a difference of 40–60 bits may indicate a 75–90% probability of correlating the data. Two other parameters also determine the quality of a pharmacophore: the configuration (or entropy) cost and the error cost. The configuration cost depends on the complexity of the pharmacophore and should be <17, whereas the error cost depends on the root mean square difference between the estimated and the actual activities of the training set. The difference between the null cost and the total cost of Hypo1 was 168.4843 bits, well above the 40–60 bit range, which indicates a more than 90% probability of correlating the data. Notably, the total cost of Hypo1 was much closer to the fixed cost than to the null cost. Furthermore, a high correlation coefficient of 0.96906 was observed, with an RMS value of 1.0719 and a configuration cost of 15.4729, demonstrating that a reliable pharmacophore model with high predictivity was developed.
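The cost arithmetic for Hypo1 can be checked with a short Python sketch using the values reported in Section 3.1. The variable names are illustrative; the reported cost difference of 168.4843 bits corresponds to the null cost minus the total cost.

```python
# Cost values for Hypo1 as reported in the text (in bits).
fixed_cost = 77.5618
null_cost = 258.686
total_cost = 90.2017

# Reported "cost difference": how far the hypothesis lies below the null hypothesis.
cost_difference = null_cost - total_cost

# Distance of the hypothesis from the ideal (fixed-cost) model.
gap_to_fixed = total_cost - fixed_cost

# Full span between the two theoretical limits.
null_minus_fixed = null_cost - fixed_cost

print(f"null - total  = {cost_difference:.4f}")
print(f"total - fixed = {gap_to_fixed:.4f}")
print(f"null - fixed  = {null_minus_fixed:.4f}")
```

The total cost lies about 12.6 bits above the fixed cost and about 168.5 bits below the null cost, which is consistent with the statement that Hypo1 is much closer to the fixed cost than to the null cost.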

3.3. Validation of Pharmacophore Model
3.3.1. Test Set Validation

The test set method examines whether the pharmacophore model is capable of predicting the activities of external compounds. The test set contains 62 compounds that are structurally different from the training set molecules. All the test set molecules were prepared in the same way as the training set molecules. Test set validation was done using the ligand pharmacophore mapping protocol, in which the 62 test set compounds were mapped onto Hypo1. The pharmacophore model performed well in estimating the activity of the test set compounds, with a significant predictive correlation value (0.87) between experimental and estimated activities (Figure 5). The experimental and estimated activities of the test set compounds mapped onto Hypo1 are shown in Table 4.
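A correlation between experimental and estimated activities can be computed as in the following minimal Python sketch. It assumes that the correlation is taken on log-scaled IC50 values, which is a common convention for activity data spanning several orders of magnitude; the activity values below are hypothetical and are not those of Table 4.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def log_activity_correlation(experimental_nm, estimated_nm):
    """Correlate log10(IC50) values (assumption: correlation on the log scale)."""
    log_exp = [math.log10(v) for v in experimental_nm]
    log_est = [math.log10(v) for v in estimated_nm]
    return pearson_r(log_exp, log_est)


# Hypothetical IC50 values in nM, for illustration only.
experimental = [3.0, 45.0, 600.0, 9000.0, 65000.0]
estimated = [5.0, 30.0, 1100.0, 4000.0, 90000.0]

r_test = log_activity_correlation(experimental, estimated)
print(round(r_test, 3))
```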

The quality of the hypothesis was further characterized using the error ratio, which is the ratio between the estimated activity and the experimental activity. An error ratio ≤10 means that the estimated and experimental activity values differ by no more than one order of magnitude. The best hypothesis (Hypo1) exhibited an error ratio ≤10 for 53 of the 62 compounds. Only 9 compounds (compound_29, compound_32, compound_34, compound_38, compound_40, compound_51, compound_52, compound_53, and compound_55), with values >10, were considered outliers and rejected. The most potent compound of the test set, compound_22, was mapped onto Hypo1 (Figure 6). Hypo1 mapped very well: all the chemical features of this compound matched, and its estimated IC50 value was 1.3 nM. Based on these results, it was confirmed that one HBD, two HBA, and one HyA (hydrophobic aliphatic) feature are essential for PDK-1 inhibitory activity.
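The outlier criterion can be sketched in Python as below. The sketch assumes the error ratio is expressed as a fold difference, that is, the larger of the two activities divided by the smaller, so that over- and under-estimation are treated symmetrically; the function names and example values are hypothetical.

```python
def error_ratio(estimated, experimental):
    """Fold difference (always >= 1) between estimated and experimental activity."""
    return max(estimated, experimental) / min(estimated, experimental)


def is_outlier(estimated, experimental, threshold=10.0):
    """True when the two activities differ by more than one order of magnitude."""
    return error_ratio(estimated, experimental) > threshold


# Hypothetical (estimated, experimental) IC50 pairs in nM, for illustration only.
pairs = {
    "cmpd_x": (2.0, 20.0),    # exactly 10-fold: within the limit
    "cmpd_y": (2.0, 50.0),    # 25-fold underestimate: outlier
    "cmpd_z": (400.0, 90.0),  # about 4.4-fold overestimate: within the limit
}

outliers = [name for name, (est, exp) in pairs.items() if is_outlier(est, exp)]
print(outliers)
```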

3.3.2. Fisher’s Validation

Fisher’s randomization test was used to evaluate the statistical relevance of Hypo1 by means of the CatScramble program. The confidence level was fixed at 95%. The CatScramble program generated 19 random spreadsheets, from which hypotheses were constructed under exactly the same conditions as those used to generate the original pharmacophore hypothesis. The total costs of the 19 randomly generated pharmacophore hypotheses and of the original pharmacophore hypothesis are presented in Figure 7. The original hypothesis (Hypo1) was far superior to the 19 random hypotheses, suggesting that Hypo1 was not generated by chance. These results provide 95% confidence in the proposed hypothesis.
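The link between 19 random runs and a 95% confidence level can be illustrated with a short Python sketch. It assumes the usual randomization-test expression, significance = [1 − (1 + x)/y] × 100, where x is the number of random hypotheses with a total cost lower than that of the original hypothesis and y is the total number of hypotheses (random runs plus the original). The random costs below are hypothetical; only the Hypo1 total cost (90.2017) is taken from the text.

```python
def randomization_confidence(original_cost, random_costs):
    """Confidence level (%) that the original hypothesis did not arise by chance."""
    x = sum(1 for cost in random_costs if cost < original_cost)
    y = len(random_costs) + 1  # random runs plus the original hypothesis
    return (1 - (1 + x) / y) * 100


hypo1_total_cost = 90.2017  # from the text

# Hypothetical total costs for 19 scrambled runs, all worse (higher) than Hypo1.
random_total_costs = [150.0 + i for i in range(19)]

confidence = randomization_confidence(hypo1_total_cost, random_total_costs)
print(f"{confidence:.1f}%")
```

With 19 random runs, none of which beats the original hypothesis, the expression gives 95%; a 99% level would require 99 random runs under the same assumption.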

3.4. Pharmacophore Based Virtual Screening

The validated 3D QSAR pharmacophore model Hypo1 was used as a 3D structural query for retrieving potent compounds from the NCI database and the Maybridge database, which contain 238,819 molecules and 2,000 molecules, respectively. A total of 8,833 compounds exhibited good mapping with Hypo1 using the fast and flexible search method. Of these, 8,530 compounds were from the NCI database and 333 compounds were from the Maybridge database. Of these 8,833 molecules, 2,033 were selected for further study on the basis of their estimated activity (in μM). These hits were then sorted by Lipinski’s rule of five to evaluate their drug-likeness, and a total of 1,613 molecules passed this filter. These 1,613 molecules were further evaluated in ADMET studies, and only 842 molecules passed the ADMET filtration. Of these, only the 43 molecules with an estimated activity ≤0.5 μM were selected, and molecular docking was performed for them.
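The Lipinski filtration step can be illustrated with a minimal Python sketch. It applies the standard rule-of-five thresholds (molecular weight ≤500 Da, logP ≤5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors) in a strict form that allows no violations, which is an assumption since the text does not state how many violations were tolerated. The filtration in this study was carried out in Pipeline Pilot; the descriptor values and compound names below are hypothetical.

```python
def passes_lipinski(mol_weight, logp, h_donors, h_acceptors):
    """Strict rule-of-five check (assumption: zero violations allowed)."""
    return (
        mol_weight <= 500
        and logp <= 5
        and h_donors <= 5
        and h_acceptors <= 10
    )


# Hypothetical precomputed descriptors, for illustration only.
hits = [
    {"name": "hit_A", "mol_weight": 342.4, "logp": 2.1, "h_donors": 2, "h_acceptors": 5},
    {"name": "hit_B", "mol_weight": 612.7, "logp": 4.3, "h_donors": 3, "h_acceptors": 9},
    {"name": "hit_C", "mol_weight": 455.0, "logp": 5.8, "h_donors": 1, "h_acceptors": 6},
]

drug_like = [
    h["name"]
    for h in hits
    if passes_lipinski(h["mol_weight"], h["logp"], h["h_donors"], h["h_acceptors"])
]
print(drug_like)
```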

3.5. Molecular Docking Studies

The selected compounds (retrieved hits) were studied further to evaluate their binding mode with the protein. All the compounds and compound_1 were docked into the binding site of PDK-1 [26] (PDB entry: 1UU7) [27] using the LigandFit [28] docking method implemented in the Discovery Studio 2.5 program package. Before the retrieved hits were docked, compound_1 (the most active compound of the training set) was docked into the active site of PDK-1.

(a) Compound_1 Docking Description. Compound_1 showed a docking energy of 64.5 kcal/mol and an RMSD value of 0.841, indicating that the LigandFit docking method reproduced the original binding mode. The LigandFit docking method was therefore used for all further docking. Compound_1 formed hydrogen bond interactions with the important residues Lys111, Asp230, Ala162, and Tyr161, as shown in Figure 8(a).
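The RMSD used to judge whether a docked pose reproduces a reference binding mode can be sketched in Python as follows. The sketch assumes that both poses are in the same coordinate frame (no superposition is performed) and that atoms are listed in the same order; the coordinates are hypothetical, and the units are assumed to be Å.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical heavy-atom coordinates for a reference pose and a docked pose.
reference_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0), (1.5, 1.5, 0.5)]

print(round(rmsd(reference_pose, docked_pose), 3))
```

A low value indicates that the docked pose stays close to the reference pose, which is the sense in which the reported RMSD of 0.841 supports the docking protocol.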

(b) Other Compounds Docking Description. All 43 selected molecules were docked into the active site of PDK-1 kinase, and only the top 7 molecules, which had high docking energy, better hydrogen bond interactions with active site residues, and lower estimated activity (≤0.19 μM), were retained. The estimated activity, interaction energy, and LigandFit scores of all seven compounds, along with those of compound_1, are listed in Table 5. Finally, three compounds were selected for further analysis: NSC_218341, NSC_24871, and NSC_211930. Compounds NSC_211930, NSC_218341, and NSC_24871 mapped to all features of Hypo1. Compound NSC_211930 formed a hydrogen bond with Ala162, a hinge region amino acid; its amide group formed a hydrogen bond with Asp223, and Lys111 was involved in a cation-pi interaction. Compound NSC_24871 formed hydrogen bond interactions with Lys111, Ser160, and Ala162. The phenyl ring of this compound was sandwiched between the aromatic rings of Tyr161 and Phe93, with which it formed pi-pi interactions. Tyr161 formed pi-pi interactions with the phenyl ring of compound NSC_218342, while the carboxyl groups were involved in the formation of two hydrogen bonds with Lys111 and Phe94. The phenolic oxygen was involved in the formation of hydrogen bonds with Ser160 and Ala162. In all cases, Tyr161 was involved in a pi-pi interaction with the phenyl ring of the compound. 2D representations of the molecular docking results of all three compounds are shown in Figures 8(b), 8(c), and 8(d). Lys111 formed two hydrogen bonds with two different oxygen atoms of the phenyl groups of compound NSC_24871. In addition, one phenolic oxygen atom formed two hydrogen bonds with the two hinge region amino acids, Ser160 and Ala162. These three compounds, retrieved from the two databases (NCI and Maybridge), exhibited good interactions with important amino acids in the active site.
Among the three compounds, compound NSC_218342, retrieved from the NCI database, exhibited good estimated activity, fit value, docking score, and hydrogen bond interactions. The molecular docking results support taking these molecules forward as potential leads for designing novel PDK-1 inhibitors in the future.

4. Conclusions

A ligand-based computational method was used to identify the molecular structural features required for effective PDK-1 inhibitors, with the aim of discovering drugs to prevent and cure a wide variety of cancers. A data set of 83 selective PDK-1 inhibitors, with activities spanning several orders of magnitude, was used to generate pharmacophore hypotheses and to predict activity. A highly predictive pharmacophore model was generated from 21 training set molecules; its chemical features, hydrogen bond acceptor, hydrogen bond donor, and hydrophobic aliphatic groups, describe their activities towards PDK-1 kinase. The model was validated with 62 test set molecules, which showed that it was able to differentiate various classes of PDK-1 inhibitors, with a high correlation coefficient of 0.87 between experimental and predicted activities. Hypo1 was further validated by the decoy set method, which gave a GH score of 0.73, indicating that the model is highly efficient in screening molecules from a database. Hypo1 was used as a 3D query to screen potential molecules from the NCI database as well as the Maybridge database. The hit compounds were subsequently filtered by Lipinski’s rule of five and by ADMET filtration, and the selection was refined further by a docking study. After the docking studies, three molecules with different scaffolds (NSC_218342, NSC_24871, and NSC_211930) exhibited better docking energy as well as better interactions. These candidates will be evaluated further as anticancer molecules in in vitro and in vivo studies.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors thank Dr. J. A. R. P. Sarma, Senior Vice President, and Dr. K. V. Radhakishan, Ex-Director of GVK Biosciences Pvt., Ltd., for their cooperation and providing software facilities.