Molecular Modeling of Antimalarial Agents by 3D-QSAR Study and Molecular Docking of Two Hybrids 4-Aminoquinoline-1,3,5-triazine and 4-Aminoquinoline-oxalamide Derivatives with the Receptor Protein in Its Both Wild and Mutant Types
Modeling studies using 3D-QSAR and molecular docking methods were performed on a set of 34 hybrid 4-aminoquinoline derivatives previously studied as effective antimalarial agents against wild-type and quadruple-mutant Plasmodium falciparum dihydrofolate reductase (DHFR). Multiple linear regression (MLR) was used to build the QSAR model. The DFT-B3LYP method with the 6-31G basis set was used to calculate the quantum chemical descriptors, chosen to represent the electronic properties of the molecular structures, while the MM2 method was used to calculate the lipophilic, geometrical, physicochemical, and steric descriptors. The QSAR model tested with the artificial neural network (ANN) method shows high predictive performance. The model was confirmed by three validation methods: leave-one-out (LOO) cross validation, Y-randomization, and external validation. The molecular docking of three compounds (9, 11, and 26) on both the wild and quadruple mutant types of pf-DHFR-TS as the protein target helps to understand and predict the binding modes within the binding sites.
1. Introduction
Malaria is one of the world's greatest public health challenges. It is most prevalent in sub-Saharan African, Asian, and South American countries, and it mostly affects children under the age of five and pregnant women [1, 2]. According to a 2015 World Health Organization (WHO) report, an estimated 3.2 billion people were at risk of malaria, and approximately 212 million cases and 429,000 deaths occurred worldwide in 2015. Of these estimated deaths, 90% occurred in sub-Saharan Africa. Malaria is an infectious disease caused by protozoa of the genus Plasmodium. Five species infect humans (P. falciparum, P. vivax, P. malariae, P. ovale, and P. knowlesi). Among these five species, Plasmodium falciparum is the most severe and lethal. Considerable effort has been devoted to finding efficient inhibitors of this parasite by testing a variety of molecular structures. The quinoline moiety has attracted great attention from medicinal chemists, as it is one of the crucial pharmacophores responsible for antimalarial action [6, 7]. In addition, the 1,3,5-triazine derivatives cycloguanil, chlorcycloguanil, clociguanil, and WR99210 are already known as effective and specific inhibitors of the P. falciparum dihydrofolate reductase (DHFR) domain, and they selectively inhibit biochemical processes that are vital for parasite growth. To overcome drug resistance, the concept of hybrid molecules has been introduced, in which two or more pharmacophores are linked together (as in quinoline-triazine and quinoline-oxalamide), and it is believed that these compounds act by simultaneously inhibiting two conventional targets. In this study, we worked on these pharmacophores in the form of two types of hybrids: 4-aminoquinoline-triazine and 4-aminoquinoline-oxalamide.
The discovery of new antimalarial drugs is very challenging. The aim of developing a QSAR model is to construct, by statistical methods, a relationship between structural properties and activities using a training set, so that the model can predict the activity of compounds that were not used to build it. Here, this was done with multiple linear regression (MLR) and artificial neural network (ANN) calculations. The QSAR model was validated by internal and external validation as well as Y-randomization. To explore the binding modes of this set of hybrids in the active sites, we docked three compounds: the most active compound of the triazine series (compound 26), the most active compound of the oxalamide series (compound 9), and the least active compound of the entire series (compound 11). The target was Plasmodium falciparum dihydrofolate reductase–thymidylate synthase (pf-DHFR-TS) in its two forms: the wild type and the quadruple mutant. This study allows the development of models that not only provide details of the binding modes and key molecular interactions but also allow the prediction of relative inhibition and binding affinities that could be reproduced in silico.
2. Materials and Methods
2.1. Experimental Data
In this work, a data set of 34 hybrids of 4-aminoquinoline, divided into two groups, is explored. The first group (4-aminoquinoline-oxalamides) comprises 16 molecules numbered from 1 to 16, and the second group (4-aminoquinoline-triazines) contains 18 compounds numbered from 17 to 34 (Figure 1). The chemical structures of these hybrid derivatives and their antimalarial activities (IC50) are presented in Tables 1 and 2. The observed activities are converted to a logarithmic scale (log IC50).
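The conversion to the logarithmic scale can be sketched as follows. This is only an illustration: it assumes a base-10 logarithm, and it uses the three IC50 values quoted for compounds 9, 11, and 26 in the docking section rather than the full data of Tables 1 and 2. The function and variable names are hypothetical.

```python
import math

# IC50 values quoted in the text for compounds 9, 11 and 26
# (units as given in Tables 1 and 2 of the paper).
ic50 = {9: 15.58, 11: 261.84, 26: 5.23}

def log_ic50(value):
    """Return log10 of an IC50 value (assumed base-10 logarithm)."""
    if value <= 0:
        raise ValueError("IC50 must be positive")
    return math.log10(value)

# Dependent variable used in the QSAR model: log IC50 per compound.
log_activity = {cid: log_ic50(v) for cid, v in ic50.items()}

# Lower log IC50 means higher potency, so sort ascending.
ranked = sorted(log_activity, key=log_activity.get)
```

With these values, `ranked` lists compound 26 first and compound 11 last, in line with their description as the most and least active compounds.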
2.2. Molecular Descriptors Calculation
In order to accurately model and predict inhibitor activities, the 16 descriptors listed in Table 3 were used. Eleven lipophilic, geometrical, physicochemical, and steric descriptors were calculated with the MM2 method using the ACD/ChemSketch program and the ChemBioOffice software. In addition, 5 electronic descriptors were calculated with the DFT method, using the Gaussian03 quantum chemistry package. The geometry optimization of the compounds was performed with the DFT method using Becke's three-parameter hybrid functional (B3LYP) with a 6-31G basis set for the calculation of the electronic descriptors, and with the MM2 method for the remaining descriptors.
2.3. Analysis Methods
Multiple linear regression (MLR) analysis with the backward (descending) selection method was used to select the most appropriate descriptors. MLR is a mathematical technique for studying the relation between one dependent variable and several independent variables. The quality of the regression is judged on three criteria: the coefficient of determination (R2), the Fisher ratio value (F), and the root mean square error (RMSE). The MLR model was generated using the software XLSTAT version 2013. Note that MLR also served to select the descriptors used as input parameters of the artificial neural network (ANN).
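As an illustration of the three criteria, the following minimal sketch fits an ordinary least-squares model with NumPy and computes R2, RMSE, and F. It is not the XLSTAT procedure used in the paper, it omits the backward selection step, and it runs on synthetic data with 28 samples and 6 descriptors. The RMSE here divides by the number of samples, whereas some packages divide by the residual degrees of freedom. All names are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit with intercept; returns coefficients and fit statistics."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    y_hat = A @ coef
    ss_res = float(np.sum((y - y_hat) ** 2))      # residual sum of squares
    ss_tot = float(np.sum((y - y.mean()) ** 2))   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = float(np.sqrt(ss_res / n))             # assumption: divide by n
    f_value = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return {"coef": coef, "r2": r2, "rmse": rmse, "F": f_value}

# Synthetic stand-in for a 28-compound, 6-descriptor training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(28, 6))
true_coef = np.array([1.0, -0.5, 0.8, 0.3, -1.2, 0.6])
y = 2.0 + X @ true_coef + rng.normal(scale=0.05, size=28)

stats = fit_mlr(X, y)
```

In a backward selection, a fit of this kind would be repeated while removing the least significant descriptor at each step.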
The ANN analysis was performed using the SAS JMP package (v8.0, SAS Institute Inc., Cary, NC, USA). The neurons are arranged in three layers: the input layer contains six neurons representing the relevant descriptors obtained with the MLR technique, the output layer contains one neuron representing the calculated activity values (log IC50), and the hidden layer is composed of 3 neurons. The size of the hidden layer was chosen using the parameter ρ = (number of data points in the training set)/(number of adjustable weights, i.e., connections in the network). In this work, ρ was kept in the interval 1 < ρ < 3 [19, 20].
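The ρ check for the 6-3-1 architecture can be sketched as below. The sketch assumes that ρ is the number of training data points divided by the number of adjustable weights, and that each hidden and output neuron carries one bias term; both are assumptions, since the text does not spell them out. The function names are hypothetical.

```python
def count_weights(n_in, n_hidden, n_out, bias=True):
    """Number of adjustable weights in a fully connected three-layer network."""
    b = 1 if bias else 0
    return (n_in + b) * n_hidden + (n_hidden + b) * n_out

def rho(n_samples, n_in, n_hidden, n_out, bias=True):
    """Ratio of training data points to adjustable weights."""
    return n_samples / count_weights(n_in, n_hidden, n_out, bias)

# 28 training compounds, 6-3-1 architecture.
n_weights = count_weights(6, 3, 1)        # (6+1)*3 + (3+1)*1 = 25
rho_with_bias = rho(28, 6, 3, 1)          # 28 / 25 = 1.12
rho_without_bias = rho(28, 6, 3, 1, bias=False)  # 28 / 21
```

Under either bias convention, the ratio falls inside the interval 1 < ρ < 3 quoted in the text.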
A high correlation coefficient indicates how well the equation fits the data. To explore the stability of this equation, leave-one-out cross validation was carried out using the ANN method. In this technique, a number of modified data sets are created by deleting one compound in each case, and the corresponding models then serve to predict the activity of the removed compound.
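The LOO cross-validation coefficient $R^2_{cv}$ was calculated as follows:

$$R^2_{cv} = 1 - \frac{\sum_i \left(Y_{obs,i} - Y_{pred,i}\right)^2}{\sum_i \left(Y_{obs,i} - \bar{Y}_{obs}\right)^2}$$

where $Y_{obs,i}$ and $Y_{pred,i}$ are the observed and predicted values of the dependent variable, respectively, and $\bar{Y}_{obs}$ is the average observed value.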
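The leave-one-out procedure can be sketched as follows. The paper runs it with the ANN, whereas this sketch substitutes a linear least-squares model as a stand-in to stay short. It also assumes the usual form of the coefficient, one minus the ratio of the predictive residual sum of squares to the total sum of squares, and it uses synthetic data. All names are hypothetical.

```python
import numpy as np

def loo_predictions(X, y):
    """Predict each sample from a linear model fitted on all the others."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                  # leave sample i out
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef                    # predict the removed sample
    return preds

def r2_cv(y_obs, y_pred):
    """Cross-validated coefficient: 1 - PRESS / total sum of squares."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    press = np.sum((y_obs - y_pred) ** 2)
    ss_tot = np.sum((y_obs - y_obs.mean()) ** 2)
    return float(1.0 - press / ss_tot)

# Synthetic stand-in for the 28-compound training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(28, 6))
y = X @ np.array([1.0, -0.5, 0.8, 0.3, -1.2, 0.6]) + rng.normal(scale=0.05, size=28)

q2 = r2_cv(y, loo_predictions(X, y))
```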
In order to ensure the reliability of the QSAR model, the Y-randomization test was used. This approach consists of randomly shuffling the experimental activities of the training set while keeping the same descriptors; new QSAR models are then built on the shuffled data to exclude the possibility of chance correlation in the obtained model.
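A minimal sketch of the Y-randomization test is given below, assuming a linear least-squares model and synthetic data. The number of shuffles (100) and the function names are illustrative choices, not values taken from the paper.

```python
import numpy as np

def r2_fit(X, y):
    """Coefficient of determination of a least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return float(1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2))

def y_randomization(X, y, n_trials=100, seed=0):
    """Refit the model on shuffled activities; return the R2 of each trial."""
    rng = np.random.default_rng(seed)
    return np.array([r2_fit(X, rng.permutation(y)) for _ in range(n_trials)])

# Synthetic stand-in for the 28-compound training set.
rng = np.random.default_rng(1)
X = rng.normal(size=(28, 6))
y = X @ np.array([1.0, -0.5, 0.8, 0.3, -1.2, 0.6]) + rng.normal(scale=0.05, size=28)

r2_true = r2_fit(X, y)               # model on the real activities
r2_random = y_randomization(X, y)    # models on shuffled activities
```

If the original model is not a chance correlation, `r2_true` should lie clearly above the whole distribution of `r2_random`.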
Furthermore, external validation is necessary to confirm the predictive ability of the QSAR model. The data set in this work was therefore randomly divided into a training set of 28 compounds, used to develop the MLR model, and a prediction set of 6 compounds reserved for external validation.
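The ability of the built model on the external prediction set was evaluated by $R^2_{pred}$, which can be calculated as follows:

$$R^2_{pred} = 1 - \frac{\sum_i \left(Y_{pred(test),i} - Y_{exp(test),i}\right)^2}{\sum_i \left(Y_{exp(test),i} - \bar{Y}_{train}\right)^2}$$

where $Y_{pred(test),i}$ and $Y_{exp(test),i}$ are the predicted and experimental values of the samples in the prediction set, respectively, and $\bar{Y}_{train}$ is the average value of the dependent variable for the training set.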
A value of this coefficient of at least 0.5 is considered an indicator of the reliability of the model. However, Golbraikh and Tropsha showed that such a coefficient alone is not a good parameter for estimating the reliability of a QSAR model. An external validation based on the Golbraikh and Tropsha criteria is therefore necessary.
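The external validation step can be sketched as follows. The predictive coefficient uses the training-set mean in the denominator, as described above. The criteria check follows one common reading of the Golbraikh and Tropsha conditions (squared correlation above 0.6, a regression through the origin whose slope lies between 0.85 and 1.15, and a relative gap between the two determination coefficients below 0.1). The thresholds, the variable names, and the six observed and predicted values are all assumptions for illustration, not data from Tables 6 and 7.

```python
import numpy as np

def r2_pred(y_test, y_test_pred, y_train_mean):
    """Predictive R2 on the external set, referenced to the training-set mean."""
    y_test = np.asarray(y_test, dtype=float)
    y_test_pred = np.asarray(y_test_pred, dtype=float)
    ss_res = np.sum((y_test_pred - y_test) ** 2)
    ss_ref = np.sum((y_test - y_train_mean) ** 2)
    return float(1.0 - ss_res / ss_ref)

def golbraikh_tropsha(y_obs, y_pred):
    """Evaluate the criteria on external-set observed and predicted values."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r2 = np.corrcoef(y_obs, y_pred)[0, 1] ** 2
    # Slopes of the regressions through the origin (both directions).
    k = np.sum(y_obs * y_pred) / np.sum(y_pred ** 2)
    k_prime = np.sum(y_obs * y_pred) / np.sum(y_obs ** 2)
    # Determination coefficients of the through-origin regressions.
    r0_2 = 1 - np.sum((y_obs - k * y_pred) ** 2) / np.sum((y_obs - y_obs.mean()) ** 2)
    r0p_2 = 1 - np.sum((y_pred - k_prime * y_obs) ** 2) / np.sum((y_pred - y_pred.mean()) ** 2)
    return {
        "r2 > 0.6": bool(r2 > 0.6),
        "slope in [0.85, 1.15]": bool(0.85 <= k <= 1.15 or 0.85 <= k_prime <= 1.15),
        "(r2 - r0_2)/r2 < 0.1": bool((r2 - r0_2) / r2 < 0.1 or (r2 - r0p_2) / r2 < 0.1),
    }

# Hypothetical external set of 6 compounds (illustrative numbers only).
y_obs = np.array([0.80, 1.10, 1.50, 1.90, 2.20, 2.40])
y_pred = np.array([0.85, 1.06, 1.56, 1.85, 2.23, 2.38])

external_r2 = r2_pred(y_obs, y_pred, y_train_mean=1.6)  # hypothetical training mean
checks = golbraikh_tropsha(y_obs, y_pred)
```

Because each paired criterion is accepted when either direction of the regression passes, the outcome does not depend on which of the two slopes is labelled k and which k′.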
In order to gain insight into the key structural requirements of the antimalarial activity, molecular docking studies were carried out using the AutoDock4.2 program. X-ray crystal structures of Plasmodium falciparum pf-DHFR-TS in the wild type (coded as 1J3I.pdb) and the quadruple mutant (coded as 1J3K.pdb) were obtained from the Protein Data Bank. The minimized protein structures were defined as receptors, and the first step in the preparation of each receptor was the removal of the ligands and the water molecules. The 3D grid was created with the AUTOGRID algorithm to evaluate the interaction energy between the protein and the ligands. The grid maps were constructed using 60 × 60 × 60 points in the x, y, and z directions, with a grid point spacing of 0.375 Å. The grid box was centered at (29.39 Å, 5.56 Å, 52.49 Å), according to the ligand location in the complex. Discovery Studio software was used for the 2D and 3D visualizations of the established interactions.
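The size and position of the search box implied by these grid settings can be worked out with a short calculation. The sketch below assumes that the value 60 is read as the number of grid intervals along each axis, which is how AutoGrid's `npts` parameter is usually interpreted (giving 61 points per axis); under that assumption each edge is 60 × 0.375 Å = 22.5 Å. The function name is hypothetical.

```python
def grid_box(center, npts, spacing):
    """Return the lower and upper corners of a grid box centered on `center`."""
    half = [n * spacing / 2.0 for n in npts]      # half edge length per axis
    lower = tuple(c - h for c, h in zip(center, half))
    upper = tuple(c + h for c, h in zip(center, half))
    return lower, upper

center = (29.39, 5.56, 52.49)   # grid center in angstroms, from the text
npts = (60, 60, 60)             # assumed: number of intervals per axis
spacing = 0.375                 # grid point spacing in angstroms

edge_lengths = tuple(n * spacing for n in npts)   # 22.5 angstroms per axis
lower, upper = grid_box(center, npts, spacing)
```

A box of this size spans roughly 18.14 Å to 40.64 Å along x, which can be compared against the coordinates of the co-crystallized ligand to confirm that the binding site is covered.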
3. Results and Discussion
In this study, we used two random distributions of compounds into the training and test sets. The first training set included 28 compounds, and the corresponding test set included 6 compounds. The selected descriptor values and the activities predicted for the training set by the MLR, ANN, and CV methods are summarized in Table 4.
3.1. Multiple Linear Regression
The QSAR model of the training set built using the MLR method is represented by the following equation: where N is the number of compounds, R is the correlation coefficient, R2 is the determination coefficient, RMSE is the root mean square error, and F is the Fisher test value. The relevant descriptors involved in the MLR model are HOMO energy, total energy, repulsion energy, torsion, critical temperature, and stretch-bend. The corresponding normalized coefficients are presented in Figure 2, and the correlation of the observed activities with the MLR calculated ones is illustrated in Figure 3.
The statistical coefficients of the correlation between the observed and calculated activities for the training set are quite significant, and the low RMSE indicates that the model is reliable and predicts with good precision.
3.2. Artificial Neural Networks
To improve the characterization of the studied compounds, an artificial neural network (ANN) was used as a nonlinear method to generate a predictive model relating the observed antimalarial activities to the set of molecular descriptors selected by MLR, using a 6-3-1 network architecture. The correlation of the observed activities with the ANN predicted ones is illustrated graphically in Figure 4.
As shown in Figure 4, a good correlation is obtained between the observed antimalarial activities and the activities predicted by the ANN: the correlation coefficient is R = 0.98, the determination coefficient is R2 = 0.97, and the standard error of estimate is RMSE = 0.09. These results show that the descriptors selected by MLR are pertinent, that the ANN model is of statistically significant quality, and that the model proposed to predict antimalarial activity is relevant.
3.3. Cross Validation
The QSAR model proposed to predict the activity of new compounds should be tested. To validate our results, we used the LOO procedure, which involves removing a single molecule from the set containing 28 molecules and making a prediction for antimalarial activity. This procedure is repeated 28 times in order to estimate the predictive ability of such models. The correlation of the observed activities with the calculated cross-validation ones is shown graphically in Figure 5.
The obtained correlation (R = 0.90, R2 = 0.81, and RMSE = 0.16) shows a high predictive power of the MLR model. This result shows that our QSAR model is not sensitive to the operation of setting a molecule aside and returning it to the training set. This is a first indication of the stability of the selected QSAR model.
3.4. Y-Randomization
The Y-randomization test was performed to make sure that there is no chance correlation. In this way, we could test the validity of the established QSAR model and check that the selected descriptors were not chosen by chance; the models built on the randomized activities should consequently have low statistical quality. The results of the Y-randomization method are given in Table 5 and Figure 6.
The new QSAR model built using the Y-randomization method is represented by the following equation:
The correlation coefficient of the randomized samples is close to that obtained by applying the model to the training set. This result indicates the absence of dependence between the descriptors included in the QSAR model.
3.5. External Validation
In a study on efficient methods of validation for QSAR models, Golbraikh and Tropsha showed that LOO methods are necessary but not sufficient. They argued that external validation is indispensable and proposed criteria that help to validate a QSAR model. This validation is done in two steps: prediction, with the MLR model, of the activities of new compounds that were not used in developing the model on the training set (Table 6), and verification of the Tropsha criteria (Table 7).
The results show that the Golbraikh and Tropsha criteria are satisfied. All validations indicate that the built QSAR model is robust and satisfactory. The model established in this study meets all of the principles for QSAR validation and can be used to predict the antimalarial activity.
3.6. Docking Studies
A pioneering study on the binding modes and the localization of the principal active sites in the wild and mutant proteins was performed with WR99210, a potent 1,3,5-triazine inhibitor at the preclinical stage. It found that the important sites are located at Ile14, Ala16, Met55, Asp54, Ser108, Ile164, and Tyr170 in the case of the wild type and at Ala16, Cys50, Asn51, Cys59, Asn108, Leu164, and Tyr170 in the case of the mutant protein. In an attempt to give insight into the interaction modes and to identify the types of interaction established with this protein (pf-DHFR-TS) in its two forms, wild and mutant, the molecular docking study performed in this work was applied to three compounds, 9 (IC50 = 15.58), 11 (IC50 = 261.84), and 26 (IC50 = 5.23), with the binding sites of both the wild type and the quadruple mutant. The docking results and docked conformations of the ligands in the active sites are represented in Figure 7.
In the case of the wild type, compound 26 forms hydrogen bonds with the carboxylate oxygen atoms of ILE164, SER108, and SER111 through the two NH groups bonded to the triazine ring and one of the triazine nitrogen atoms, at distances of 2.49 Å, 2.77 Å, and 2.93 Å, respectively, together with a nonbonded π-sigma interaction between the phenyl ring of the quinoline and MET55 at a distance of 3.52 Å. In the case of the quadruple mutant, three hydrogen bonds with ASN108, SER111, and ALA16 were observed, involving two NH groups linked to the 1,3,5-triazine and the oxygen of the morpholino group, at distances of 2.85 Å, 1.84 Å, and 2.91 Å, respectively. For compound 9, in the case of the wild type, two hydrogen bonds are formed between an oxygen and a nitrogen of the oxalamide group and the residues ILE164 and TYR170, at distances of 2.24 Å and 2.73 Å, respectively. In the case of the quadruple mutant, it forms two hydrogen bonds with ALA16 and LEU164 through two nitrogen atoms (the first is linked to the oxalamide group, and the second belongs to the diethylamine group), at distances of 2.77 Å and 2.80 Å, respectively. Compound 11, however, showed only one hydrogen bonding interaction, with LEU40, in both cases.
In analyzing these results, we first observed that the residues with which compounds 26 and 9 interact are reported as the most important binding sites for antimalarial activity, which is not the case for compound 11. Secondly, we observed that the number of hydrogen bonds differs between the most active compound, which belongs to the triazine family, and the less active compound, which belongs to the oxalamide family. This could explain the potent antimalarial activity of compound 26 and the importance of the triazine group, compared with the oxalamide group, in enhancing the antimalarial activity.
4. Conclusions
The present study on a series of 4-aminoquinoline-triazines and 4-aminoquinoline-oxalamides was carried out using 3D-QSAR and docking techniques with the aim of predicting the antimalarial activity. The group contribution method (for both training and test sets) was used to develop a reliable QSAR model for predicting antimalarial activity. The results of the MLR and ANN methods on the training set clearly show a strong relationship between the structural properties and the activity, and the correlation coefficients of both methods indicate good predictive ability of the model. The model was validated by internal and external validation methods, including leave-one-out cross validation and Y-randomization. The obtained model is robust in predicting the antimalarial activity. The observed activity was further corroborated by a molecular docking study, which explained the differences in activity among the compounds, especially between the triazine family and the oxalamide one. The results of these studies provide details of the predicted binding modes and the key molecular interactions. These will give medicinal chemists opportunities to develop new antimalarial drugs based on new hybrid molecules.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
World Health Organization, World Malaria Report 2015, World Health Organization, Geneva, Switzerland, 2016.
A. Mishra, H. Batchu, K. Srivastava, P. Singh, P. K. Shukla, and S. Batra, “Synthesis and evaluation of new diaryl ether and quinoline hybrids as potential antiplasmodial and antimicrobial agents,” Bioorganic and Medicinal Chemistry Letters, vol. 24, no. 7, pp. 1719–1723, 2014.
ACD/ChemSketch, ACD/ChemSketch for Academic and Personal Use: ACD/Labs.com, (n.d.), February 2018, http://www.acdlabs.com/resources/freeware/chemsketch/.
ChemOffice, ChemOffice Professional–PerkinElmer Informatics Logiciels, (n.d.), 2017, http://www.cambridgesoft.com/Ensemble_for_Chemistry/ChemOffice/ChemOfficeProfessional/.
R. G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules, Oxford University Press, Oxford, UK, 1989.
M. J. Frisch, G. W. Trucks, H. B. Schlegel et al., Gaussian 03, Revision C.02, Gaussian Inc., Wallingford, CT, USA, 2004.
R. B. Darlington, Regression and Linear Models, McGraw-Hill, New York, NY, USA, 1990.
XLSTAT, XLSTAT Logiciel de Statistique Pour MS Excel | Logiciel Statistique Pour Excel, (n.d.), February 2018, https://www.xlstat.com/fr/.
S. S. So and W. G. Richards, “Application of neural networks: quantitative structure-activity relationships of the derivatives of 2,4-diamino-5-(substituted-benzyl)pyrimidines as DHFR inhibitors,” Journal of Medicinal Chemistry, vol. 35, no. 17, pp. 3201–3207, 1992.
RCSB PDB, RCSB PDB: Homepage, (n.d.), 2018, http://www.rcsb.org/.
Dassault Systèmes BIOVIA, Discovery Studio Modeling Environment, Release 4, Dassault Systèmes, San Diego, CA, USA, 2015.