Research Article  Open Access
Exploring QSAR for Antimalarial Activities and Drug Distribution within Blood of a Series of 4-Aminoquinoline Drugs Using Genetic-MLR
Abstract
Malaria has been one of the most significant public health problems for centuries. QSAR models of the antimalarial activity and blood-to-plasma concentration ratio of Chloroquine and a new series of 4-aminoquinoline derivatives were developed using the genetic algorithm with multiple linear regression (GA-MLR) method. We obtained two different models against Chloroquine-sensitive (3D7) and Chloroquine-resistant (W2) strains of Plasmodium falciparum with good adjustment levels. Drug distribution in blood, defined as the drug blood-to-plasma concentration ratio, is related to molecular descriptors. Leave-many-out (LMO) and Y-randomization methods confirmed the models' robustness.
1. Introduction
With approximately 243 million cases and 863,000 attributed deaths reported globally in 2009 [1], malaria is one of the most severe infectious diseases, primarily affecting the world’s most disadvantaged populations. Chloroquine (CQ), a low-cost drug, is a widely used antimalarial agent. However, the emergence of CQ-resistant malarial parasite strains has prompted the search for alternative strategies to combat the disease. Clones of P. falciparum are most often used for in vitro testing of antimalarial activity on different strains, among which are Chloroquine-sensitive (NF54, NF54/64, 3D7, D6, F32, D10, HB3, FCC1HN, Ghana), Chloroquine-resistant (FcB1, W2, FCM29, BHz26/86, Dd2, EN36, ENT30, FCR3, FCR3/A2), and/or multidrug-resistant (K1, TM91C235) strains, to find effective compounds against resistant malaria. Several recent QSAR models and reviews on antimalarial compounds have been reported [2–10].
The blood-to-plasma concentration ratio, defined as the drug concentration in whole blood divided by its concentration in plasma, is a measure of the drug distribution within the blood. Drugs, on reaching the bloodstream, can bind to plasma proteins and/or to blood cells. If a drug's binding in plasma exceeds its binding in blood cells, the ratio is below 1; when its binding in blood cells exceeds its plasma binding, the ratio is above 1. Red blood cells (RBCs) are the host cells for malaria parasites, and any effect of the drug on red cell membranes might be relevant for its in vivo effects [11]. The ratio may be an important parameter in drug potency and is therefore worthy of investigation. It is related to either the volume of distribution or the clearance of the drug. Even though its determination is relatively simple, such data are absent from most pharmacokinetic studies [12, 13].
The first objective of this study was to develop QSAR models that explain the in vitro antimalarial activity of a new series of 4-aminoquinolines, structurally related to CQ, against two clones of P. falciparum (3D7 and W2) using theoretical molecular descriptors. The second aim was to establish regression models that predict the blood-to-plasma concentration ratio using mainly in silico molecular descriptors.
2. Materials and Methods
2.1. Software
A Pentium IV personal computer (CPU, 3.2 GHz) running the Windows XP operating system was used. Molecular modeling and geometry optimization were performed with HyperChem [14]. Dragon software [15] was employed to calculate the theoretical molecular descriptors. SPSS software [16] was used for MLR analysis. Other statistical calculations were performed in the MATLAB [17] environment.
2.2. Ensemble Data and Molecular Descriptors
We used a series of 4-aminoquinoline antimalarial compounds with experimentally determined ADME properties, taken from the paper by Ray et al. [18]. Their research group has developed antimalarial compounds effective against drug-resistant strains of P. falciparum by varying the chemical substitutions around the heterocyclic ring and the basic amine side chain of the popular antimalarial drug chloroquine [19, 20]. Recently, they screened a panel of these novel antimalarial compounds for improved leads based on the evaluated ADMET properties [18]. Each compound in the studied database was characterized by growth inhibition of the 3D7 and W2 strains of P. falciparum and by its blood-to-plasma concentration ratio. Figure 1 depicts all the structures used in this study. The panel includes a small number of CQ analogues with altered substitutions on the quinoline ring, although the majority of the compounds in the panel contain substitutions of the alkyl groups attached to the basic nitrogen position on the aminoalkyl side chain. Two data sets of log IC_{50} (each compound at 1 and 10 μM concentration) were used for the QSAR studies. The activity data are given as IC_{50} (nM) values, referring to growth inhibition of the drug-resistant (W2) and drug-sensitive (3D7) strains of P. falciparum by the chloroquine derivatives. The experimental values of these antimalarial activities are shown in Table 1. The red blood cell (RBC) to plasma partition ratio was measured for each compound at 1 and 10 μM. The values were normalized by transforming them to the logarithm of the ratio of drug concentration in blood cells to that in plasma. The values are summarized in Table 2. The molecular structures of all the Chloroquine derivatives were built with HyperChem (Version 7, HyperCube, Inc.) software. AM1 semiempirical calculations were used to optimize the 3D geometry of the molecules. The Polak-Ribiere algorithm with a root-mean-square gradient of 0.1 kcal/mol was selected for optimization. Using DRAGON [15], we derived a total of 1481 1D, 2D, and 3D molecular descriptors from the 3D structure of each compound.
 
Table 1 footnotes: ^{a}Number of compounds given in Figure 1. ^{b}Values calculated by equations in Table 10. ^{c}Observed minus calculated values.
Table 2 footnotes: ^{a}Number of compounds given in Figure 1. ^{b}Values calculated by equations in Table 12. ^{c}Observed minus calculated values.
To decrease the redundancy in the descriptor data matrix, the correlation of the descriptors with each other and with the properties of the drugs was examined, and collinear descriptors were detected. Among each group of collinear descriptors, the one with the highest correlation with activity was retained, and the others were removed from the data matrix.
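The pruning step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the greedy ordering, the toy data, and in particular the `threshold` value of 0.9 are assumptions, since the cutoff used in the study is not recoverable from this text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return sxy / sqrt(sxx * syy)

def prune_collinear(descriptors, activity, threshold=0.9):
    """Greedy pruning of collinear descriptors.

    descriptors: dict mapping descriptor name -> list of values.
    threshold: hypothetical |r| cutoff above which two descriptors
    are treated as collinear (an assumption, not the paper's value).
    """
    # Visit descriptors from most to least correlated with the activity,
    # so the best member of each collinear group is the one retained.
    ranked = sorted(
        descriptors,
        key=lambda name: abs(pearson(descriptors[name], activity)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        if all(abs(pearson(descriptors[name], descriptors[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept

# Toy data (invented for illustration): d1 and d2 are nearly collinear.
descriptors = {
    "d1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "d2": [2.0, 4.0, 6.0, 8.0, 10.5],
    "d3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
activity = [1.1, 1.9, 3.2, 3.9, 5.1]
kept = prune_collinear(descriptors, activity)
print(kept)
```

Only one of `d1` and `d2` survives, together with the independent `d3`. A greedy pass like this is order dependent when collinearity forms chains, so it should be read as one reasonable reading of the procedure rather than the only one.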
The list and meanings of the molecular descriptors are provided by the DRAGON package, and the calculation procedure is explained in detail, with related literature references, in the Handbook of Molecular Descriptors [21].
2.3. MLR Modeling Procedure
Multiple Linear Regression (MLR), which combines great ease of implementation with the interpretability of the resulting equations, was the statistical method of choice for building the QSAR models. The forward-stepping variant of MLR was utilized, starting with the selection of the single variable that contributes most to the model based on its highest F statistic or lowest p value. At each step, MLR alters the model from the previous step by adding predictor variables, and the search terminates when a statistically significant model has been obtained [22, 23]. QSAR Modeling [24] is free Java-based software developed by the research group of the Theoretical and Applied Chemometrics Laboratory. A genetic algorithm (GA) search was carried out to explore MLR models. The GA used was the same as that used previously [25, 26].
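As a rough illustration of the forward-stepping idea (not the GA search, and not the SPSS or QSAR Modeling implementation), the sketch below adds at each step the descriptor with the largest partial F statistic and stops when no candidate exceeds a cutoff. The function names, the `f_to_enter` value of 4.0, and the toy data are assumptions made for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_rss(columns, y):
    """Least-squares fit of y on an intercept plus columns; returns (coefs, RSS)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss

def forward_select(descriptors, y, f_to_enter=4.0, max_terms=5):
    """Forward selection; f_to_enter is a hypothetical partial-F cutoff."""
    selected = []
    _, rss = fit_rss([], y)  # intercept-only model
    while len(selected) < max_terms:
        best = None
        for name in descriptors:
            if name in selected:
                continue
            cols = [descriptors[s] for s in selected + [name]]
            dof = len(y) - len(cols) - 1  # residual degrees of freedom
            if dof <= 0:
                continue
            _, new_rss = fit_rss(cols, y)
            gain = rss - new_rss
            if gain <= 0:
                f = 0.0
            elif new_rss == 0:
                f = float("inf")
            else:
                f = gain / (new_rss / dof)
            if best is None or f > best[0]:
                best = (f, name, new_rss)
        if best is None or best[0] < f_to_enter:
            break
        selected.append(best[1])
        rss = best[2]
    return selected

# Toy data (invented): y is close to 2*x1 - x3, and x2 is irrelevant.
example_descriptors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
    "x3": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0],
}
example_y = [0.1, -2.95, 4.9, 0.05, 7.95, 4.1, 12.95, 7.9]
selected = forward_select(example_descriptors, example_y)
print(selected)
```

The normal equations are solved directly here to keep the example dependency free; they become ill conditioned with strongly collinear descriptors, which is one reason collinear columns are removed before model building.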
3. Results and Discussions
3.1. The Selected Descriptors
The majority of the selected descriptors in our GA-MLR modeling are composite descriptors, which can be divided into five groups: GETAWAY, 3D-MoRSE, RDF, WHIM, and 2D autocorrelation descriptors. The GETAWAY (Geometry, Topology, and Atom Weights AssemblY) descriptors try to match the 3D molecular geometry provided by the molecular influence matrix and atom relatedness by topology with chemical information by using various atomic weighting schemes. The 3D-MoRSE descriptors are representations of the 3D structure of a molecule and encode features such as molecular weight, van der Waals volume, electronegativities, and polarizabilities. The radial distribution function (RDF) descriptors are based on the distance distribution of the compounds. WHIM descriptors are based on statistical indices calculated on the projections of atoms along principal axes. 2D autocorrelation descriptors, in general, explain how the considered property is distributed along the topological structure. Three spatial autocorrelation vectors (Moran, Geary, and Broto-Moreau), in unweighted and weighted forms, were calculated. The weighting properties considered were atomic masses (m), atomic van der Waals volumes (v), atomic Sanderson electronegativities (e), and atomic polarizabilities (p) [21]. Table 3 depicts the names and meanings of the molecular descriptors used in this work.
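To make the 2D autocorrelation idea concrete, the following sketch computes Broto-Moreau, Moran, and Geary values at a given topological lag for a toy four-atom chain. It assumes sums over unordered atom pairs and the textbook spatial-statistics forms of the Moran and Geary coefficients; pair counting and scaling conventions differ between programs, so these are not claimed to be DRAGON's exact definitions, and the weights are invented.

```python
from collections import deque

def topological_distances(adjacency):
    """All-pairs shortest path lengths (in bonds) by breadth-first search."""
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[start][v] is None:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist

def pairs_at_lag(dist, lag):
    """Unordered atom pairs (i < j) separated by exactly `lag` bonds."""
    n = len(dist)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if dist[i][j] == lag]

def broto_moreau(w, dist, lag):
    """Sum of w_i * w_j over atom pairs at the given lag."""
    return sum(w[i] * w[j] for i, j in pairs_at_lag(dist, lag))

def moran(w, dist, lag):
    """Moran coefficient: mean pair covariance over the mean squared deviation."""
    pairs = pairs_at_lag(dist, lag)
    mean = sum(w) / len(w)
    ss = sum((x - mean) ** 2 for x in w)
    if not pairs or ss == 0:
        return 0.0  # convention assumed here for empty lags or constant weights
    cov = sum((w[i] - mean) * (w[j] - mean) for i, j in pairs) / len(pairs)
    return cov / (ss / len(w))

def geary(w, dist, lag):
    """Geary coefficient: mean squared pair difference over the sample variance."""
    pairs = pairs_at_lag(dist, lag)
    mean = sum(w) / len(w)
    ss = sum((x - mean) ** 2 for x in w)
    if not pairs or ss == 0:
        return 0.0  # same assumed convention as in moran()
    diff = sum((w[i] - w[j]) ** 2 for i, j in pairs) / (2 * len(pairs))
    return diff / (ss / (len(w) - 1))

# Toy chain of four atoms (0-1-2-3) with invented atomic weights.
chain = [[1], [0, 2], [1, 3], [2]]
weights = [1.0, 2.0, 3.0, 4.0]
dist = topological_distances(chain)
ats1 = broto_moreau(weights, dist, 1)
moran1 = moran(weights, dist, 1)
geary1 = geary(weights, dist, 1)
print(ats1, moran1, geary1)
```

In a real calculation the weights would be one of the atomic properties listed above (m, v, e, or p), and each lag yields one descriptor per weighting scheme.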

Tables 4 and 5 show the data of the descriptors used in this study. The correlation matrices of the descriptors used in this study are given in Tables 6, 7, 8, and 9. Inspection of these results shows that all the off-diagonal values deviate noticeably from unity, so there is no significant correlation between the independent variables.






3.2. Validation of the Models
Goodness of fit was assessed using the squared correlation coefficient (R^{2}), the adjusted determination coefficient (R^{2}_{adj}), the standard deviation (s), the root-mean-square error (RMSE), Fisher’s statistic (F), and the number of variables. Most QSAR modeling methods implement the leave-one-out (LOO) or leave-many-out (LMO) cross-validation procedures, which are internal validation techniques [27]. The LOO cross-validation procedure removes one data point from the training set, constructs the model on the basis of the remaining training data only, and then tests it on the removed point. The LMO cross-validation procedure calculates the models leaving multiple observations out at a time, which reduces the number of times a model has to be recalculated. The outcome of the cross-validation procedure is the cross-validated coefficient Q^{2} (LOO or LMO), which is used as a criterion of both the robustness and the predictive ability of the model. In this paper, we performed LOO and leave-5-out cross-validation as the internal validation tools. The robustness of the model was examined by the Y-randomization test [28]. For the Y-randomization test, performed ten times, R^{2} ≤ 0.3 and Q^{2} ≤ 0.05 for all results were considered acceptable. These limits were selected based on the suggestions of Eriksson and coworkers [28]. The Y-randomization test can verify whether models with high values of R^{2} and Q^{2} arise from chance correlation [29, 30].
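The internal validation steps described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: it uses a single-descriptor least-squares line instead of a five-descriptor MLR model, defines the cross-validated coefficient as 1 - PRESS/TSS with the overall mean in TSS, leaves out consecutive blocks of points for LMO, and uses invented data. The names `fit_line`, `r2`, `q2`, and `y_randomization` are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a prediction function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    return lambda x: a + b * x

def r2(xs, ys):
    """Fitted determination coefficient, 1 - RSS/TSS."""
    model = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    rss = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    tss = sum((y - my) ** 2 for y in ys)
    return 1 - rss / tss

def q2(xs, ys, leave_out=1):
    """Cross-validated coefficient, 1 - PRESS/TSS.

    leave_out=1 gives LOO; leave_out=5 gives a leave-5-out variant in
    which consecutive blocks of five points are held out in turn.
    """
    n = len(ys)
    my = sum(ys) / n
    press = 0.0
    for start in range(0, n, leave_out):
        held = range(start, min(start + leave_out, n))
        train_x = [xs[i] for i in range(n) if i not in held]
        train_y = [ys[i] for i in range(n) if i not in held]
        model = fit_line(train_x, train_y)
        press += sum((ys[i] - model(xs[i])) ** 2 for i in held)
    return 1 - press / sum((y - my) ** 2 for y in ys)

def y_randomization(xs, ys, n_rounds=10, seed=0):
    """Refit on shuffled responses; returns a list of (R2, Q2_LOO) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        results.append((r2(xs, shuffled), q2(xs, shuffled)))
    return results

# Invented training data: 15 points on a noisy line.
noise = [0.2, -0.1, 0.0, 0.1, -0.2]
xs = [float(i) for i in range(1, 16)]
ys = [0.5 * x + noise[(i % 5)] for i, x in enumerate(xs)]

fit_r2 = r2(xs, ys)
q2_loo = q2(xs, ys, leave_out=1)
q2_l5o = q2(xs, ys, leave_out=5)
randomized = y_randomization(xs, ys)
print(fit_r2, q2_loo, q2_l5o)
```

For a real model, the shuffled-response coefficients in `randomized` would be compared against the acceptance limits quoted in the text; values close to those of the unshuffled model would point to chance correlation.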
To validate the predictive power of the models more realistically, external validation was also performed. For that purpose, six Chloroquine derivatives (3, 6, 8, 15, 18, and 19) were selected at random from the 21 compounds to form the external test set, and the remaining 15 Chloroquine derivatives comprised the training set that was used to calibrate the QSAR models.
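The external validation split can be illustrated as below. Only the six test compound numbers come from the text; the descriptor and activity values are synthetic, the model is a single-descriptor line rather than the MLR equations of the study, and the predictive coefficient is computed against the training-set mean, which is one common convention and an assumption here.

```python
from math import sqrt

TEST_IDS = {3, 6, 8, 15, 18, 19}  # external test compounds named in the text

def split(compounds):
    """compounds: dict id -> (descriptor, activity). Returns (train, test)."""
    train = [v for k, v in sorted(compounds.items()) if k not in TEST_IDS]
    test = [v for k, v in sorted(compounds.items()) if k in TEST_IDS]
    return train, test

def fit_line(points):
    """Least-squares intercept and slope for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - slope * mx, slope

def external_metrics(train, test):
    """Predictive R2 (relative to the training mean) and RMSE on the test set."""
    intercept, slope = fit_line(train)
    train_mean = sum(y for _, y in train) / len(train)
    press = sum((y - (intercept + slope * x)) ** 2 for x, y in test)
    spread = sum((y - train_mean) ** 2 for _, y in test)
    return 1 - press / spread, sqrt(press / len(test))

# Synthetic data for 21 compounds (not the values of the study).
noise = [0.0, 0.1, -0.05, 0.05, -0.1]
compounds = {i: (float(i), 1.0 + 0.1 * i + noise[(i - 1) % 5])
             for i in range(1, 22)}

train, test = split(compounds)
r2_pred, rmse = external_metrics(train, test)
print(len(train), len(test), r2_pred, rmse)
```

The model is fitted on the 15 training compounds only, so the six held-out compounds play no part in calibration.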
3.3. QSAR Models for 3D7 and W2 Strains
Using the best multilinear regression method, equations with up to five descriptors were constructed for the antimalarial activities against both the Chloroquine-sensitive (3D7) and Chloroquine-resistant (W2) strains of P. falciparum. The predicted log IC_{50} values and the residuals for the compounds are listed in Table 1. QSAR models generated for the two strains (3D7, W2) are shown in Table 10. These models explain the observed values of biological activity well because they fit the data closely, with high correlation coefficients and low root-mean-square errors (R^{2} = 0.94, R^{2}_{adj} = 0.92, and RMSE = 0.14 for the 3D7 strain; R^{2} = 0.94, R^{2}_{adj} = 0.91, and RMSE = 0.16 for the W2 strain). To validate the selected prediction function, a cross-validation and an external test were carried out. The models also have good predictive capacity (a coefficient of 0.86 for both strains). In general, the MLR models were able to explain data variance and were quite stable to the inclusion-exclusion of compounds, as measured by the LOO correlation coefficients (Q^{2} > 0.5). Also, the results of the LMO test are collected in Table 4. From a theoretically acceptable model the cannot have smaller values than and or . Overall, the best model is achieved when ≤ ≥ and . The Y-randomization results are in agreement with the suggested limits [28], which indicates that the variance explained by the model is not due to chance correlation. Y-randomization results are shown in Figures 2 and 3. The training set equations and statistical parameters are summarized in Table 11. Plots of LOO cross-validation and test set predictions versus experimental log IC_{50} values (for the 3D7 and W2 strains) for the MLR models are shown in Figure 4.


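Figure 4: LOO cross-validation and test set predictions versus experimental log IC_{50} values for the 3D7 and W2 strains, panels (a) and (b).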
3.4. QSAR Model for Blood-to-Plasma Concentration Ratio
The best linear models, which relate five descriptors to the log-transformed ratio values, are tabulated in Table 12. The predicted values and the residuals for the compounds are listed in Table 2. As can be seen, the MLR models have good statistical quality with low prediction error. The models obtained were validated by calculating the cross-validated values using the LOO cross-validation method, which measure the predictive power of the regression equations. The cross-validated values for the best regression models of the log-transformed ratio were suggestive of robust models. The results of the LMO test are collected in Table 3. On average, the overall test steps and which is another proof that the model is not underdetermined. The model was further validated by applying Y-randomization. Several random shuffles of the Y vector were performed. The Y-randomization results are in agreement with the suggested limits [28] and are shown in Figures 5 and 6. The prediction ability of the MLR models was also tested using the validation set of data (Table 13). The correlations between the predicted and experimental values of the log-transformed ratio (from LOO cross-validation and external test) are shown in Figure 7.


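Figure 7: Predicted versus experimental values of the log-transformed blood-to-plasma concentration ratio (from LOO cross-validation and external test), panels (a) and (b).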
4. Conclusions
A quantitative structure-activity relationship (QSAR) study was applied to a series of 4-aminoquinoline antimalarial compounds potentially active against the 3D7 and W2 strains of P. falciparum. For each strain, statistically significant models were obtained using the GA-based MLR method. These models may be considered as mathematical equations for predicting the antimalarial activities of compounds structurally similar to those used in this study. Models based on GA-MLR were also developed to predict the blood-to-plasma concentration ratio of the analogues from selected molecular descriptors. The predictive ability of the models was confirmed on the external test (validation) set. The LOO and LMO cross-validation methods, the Y-randomization technique, and the external validation indicated that the models are significant and robust and have good internal and external predictability. These models may be an important tool in early drug discovery because they provide a relevant pharmacokinetic parameter.
Acknowledgments
The authors thank the Young Researchers Club, Hamedan Branch of Islamic Azad University for financial support. The authors wish to thank Professor E. B. de Melo and Dr. R. Ghavami for their precious help on this work. Anonymous reviewers are gratefully acknowledged for their helpful suggestions that have led to improving the paper.
References
[1] World Health Organization, “World malaria report,” 2009, http://www.who.int/malaria/publications/atoz/9789241563901/en/index.html.
[2] A. R. Katritzky, O. V. Kulshyn, I. Stoyanova-Slavova et al., “Antimalarial activity: a QSAR modeling using CODESSA PRO software,” Bioorganic and Medicinal Chemistry, vol. 14, no. 7, pp. 2333–2357, 2006.
[3] R. García-Domenech, W. López-Peña, Y. Sanchez-Perdomo et al., “Application of molecular topology to the prediction of the antimalarial activity of a group of uracil-based acyclic and deoxyuridine compounds,” International Journal of Pharmaceutics, vol. 363, no. 1–2, pp. 78–84, 2008.
[4] H. Ojha, P. Gahlot, A. K. Tiwari, M. Pathak, and R. Kakkar, “Quantitative structure activity relationship study of 2,4,6-trisubstituted-s-triazine derivatives as antimalarial inhibitors of Plasmodium falciparum dihydrofolate reductase,” Chemical Biology and Drug Design, vol. 77, no. 1, pp. 57–62, 2011.
[5] P. K. Ojha and K. Roy, “Chemometric modeling, docking and in silico design of triazolopyrimidine-based dihydroorotate dehydrogenase inhibitors as antimalarials,” European Journal of Medicinal Chemistry, vol. 45, pp. 4645–4656, 2010.
[6] P. K. Ojha and K. Roy, “Chemometric modelling of antimalarial activity of aryltriazolylhydroxamates,” Molecular Simulation, vol. 36, no. 12, pp. 939–952, 2010.
[7] P. Shah and M. I. Siddiqi, “3D-QSAR studies on triclosan derivatives as Plasmodium falciparum enoyl acyl carrier reductase inhibitors,” SAR and QSAR in Environmental Research, vol. 21, no. 5–6, pp. 527–545, 2010.
[8] S. C. Basak, D. Mills, D. M. Hawkins, and A. K. Bhattacharjee, “Quantitative structure-activity relationship studies of antimalarial compounds from their calculated mathematical descriptors,” SAR and QSAR in Environmental Research, vol. 21, no. 1–2, pp. 103–125, 2010.
[9] K. Roy and P. K. Ojha, “Advances in quantitative structure-activity relationship models of antimalarials,” Expert Opinion on Drug Discovery, vol. 5, no. 8, pp. 751–778, 2010.
[10] K. Roy and P. P. Roy, “QSAR of cytochrome inhibitors,” Expert Opinion on Drug Metabolism and Toxicology, vol. 5, no. 10, pp. 1245–1266, 2009.
[11] W. Asawamahasakda, A. Benakis, and S. R. Meshnick, “The interaction of artemisinin with red cell membranes,” Journal of Laboratory and Clinical Medicine, vol. 123, no. 5, pp. 757–762, 1994.
[12] R. J. Riley, D. F. McGinnity, and R. P. Austin, “A unified model for predicting human hepatic, metabolic clearance from in vitro intrinsic clearance data in hepatocytes and microsomes,” Drug Metabolism and Disposition, vol. 33, no. 9, pp. 1304–1311, 2005.
[13] P. Paixão, L. F. Gouveia, and J. A. G. Morais, “Prediction of drug distribution within blood,” European Journal of Pharmaceutical Sciences, vol. 36, pp. 544–554, 2009.
[14] HyperChem, 7.0, Hypercube Incorporation, http://www.hyper.com.
[15] R. Todeschini, Milano Chemometrics and QSAR Group, http://www.talete.mi.it.
[16] SPSS, 16.0, SPSS Incorporation, http://www.spss.com.
[17] MATLAB, 7.0, MathWorks Incorporation, http://www.mathworks.com.
[18] S. Ray, B. Madrid, P. Catz et al., “Development of a new generation of 4-aminoquinoline antimalarial compounds using predictive pharmacokinetic and toxicology models,” Journal of Medicinal Chemistry, vol. 53, pp. 3685–3695, 2010.
[19] P. B. Madrid, A. P. Liou, J. L. DeRisi, and R. K. Guy, “Incorporation of an intramolecular hydrogen-bonding motif in the side chain of 4-aminoquinolines enhances activity against drug-resistant P. falciparum,” Journal of Medicinal Chemistry, vol. 49, no. 15, pp. 4535–4543, 2006.
[20] P. B. Madrid, J. Sherrill, A. P. Liou, J. L. Weisman, J. L. DeRisi, and R. K. Guy, “Synthesis of ring-substituted 4-aminoquinolines and evaluation of their antimalarial activities,” Bioorganic and Medicinal Chemistry Letters, vol. 15, no. 4, pp. 1015–1018, 2005.
[21] R. Todeschini and V. Consonni, Handbook of Molecular Descriptors, Wiley-VCH, London, UK, 2000.
[22] R. B. Darlington, Regression and Linear Models, McGraw-Hill, New York, NY, USA, 1990.
[23] A. Najafi and S. S. Ardakani, “2D autocorrelation modelling of the anti-HIV HEPT analogues using multiple linear regression approaches,” Molecular Simulation, vol. 37, no. 1, pp. 72–83, 2011.
[24] QSAR Modeling, 2010, Theoretical and Applied Chemometrics Laboratory, State University of Campinas, Campinas, Brazil, http://lqta.iqm.unicamp.br.
[25] A. Najafi, S. S. Ardakani, and M. Marjani, “Quantitative structure-activity relationship analysis of the anticonvulsant activity of some benzylacetamides based on genetic algorithm-based multiple linear regression,” Tropical Journal of Pharmaceutical Research, vol. 10, no. 4, p. 483, 2011.
[26] R. Ghavami, A. Najafi, M. Sajadi, and F. Djannaty, “Genetic algorithm as a variable selection procedure for the simulation of 13C nuclear magnetic resonance spectra of flavonoid derivatives using multiple linear regression,” Journal of Molecular Graphics and Modelling, vol. 27, no. 2, pp. 105–115, 2008.
[27] K. Baumann and N. Stiefl, “Validation tools for variable subset regression,” Journal of Computer-Aided Molecular Design, vol. 18, no. 7–9, pp. 549–562, 2004.
[28] L. Eriksson, J. Jaworska, A. P. Worth, M. T. D. Cronin, R. M. McDowell, and P. Gramatica, “Methods for reliability and uncertainty assessment and for applicability evaluations of classification and regression-based QSARs,” Environmental Health Perspectives, vol. 111, no. 10, pp. 1361–1375, 2003.
[29] A. Tropsha, P. Gramatica, and V. K. Gombar, “The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models,” QSAR and Combinatorial Science, vol. 22, no. 1, pp. 69–77, 2003.
[30] E. B. de Melo and M. M. C. Ferreira, “Multivariate QSAR study of 4,5-dihydroxypyrimidine carboxamides as HIV-1 integrase inhibitors,” European Journal of Medicinal Chemistry, vol. 44, no. 9, pp. 3577–3583, 2009.
Copyright
Copyright © 2013 Amir Najafi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.