Journal of Chemistry

Research Article | Open Access

Volume 2013 | Article ID 560415 | 12 pages

Exploring QSAR for Antimalarial Activities and Drug Distribution within Blood of a Series of 4-Aminoquinoline Drugs Using Genetic-MLR

Academic Editor: Georgia Melagraki
Received 28 Nov 2011
Revised 19 Jun 2012
Accepted 05 Jul 2012
Published 03 Oct 2012


Malaria has been one of the most significant public health problems for centuries. QSAR models of the antimalarial activity and blood-to-plasma concentration ratio of Chloroquine and a new series of 4-aminoquinoline derivatives were developed using the genetic algorithm with multiple linear regression (GA-MLR) method. We obtained two different models against the Chloroquine-sensitive (3D7) and Chloroquine-resistant (W2) strains of Plasmodium falciparum with good adjustment levels. Drug distribution in blood, defined as the drug blood-to-plasma concentration ratio (B/P), is related to molecular descriptors. Leave-many-out (LMO) and Y-randomization methods confirmed the models' robustness.

1. Introduction

With approximately 243 million cases and 863,000 attributed deaths reported globally in 2009 [1], malaria is one of the most severe infectious diseases, primarily affecting the world's most disadvantaged populations. Chloroquine (CQ), a low-cost drug, is a widely used antimalarial agent. However, the emergence of CQ-resistant malarial parasite strains has prompted the search for alternative strategies to combat the disease. Clones of P. falciparum are most often used for in vitro testing of antimalarial activity on different strains, including Chloroquine-sensitive (NF54, NF54/64, 3D7, D6, F32, D10, HB3, FCC1-HN, Ghana), Chloroquine-resistant (FcB1, W2, FCM29, BHz26/86, Dd2, EN36, ENT30, FCR3, FCR-3/A2), and/or multidrug-resistant (K1, TM91C235) strains, to find compounds effective against resistant malaria. Several recent QSAR models and reviews on antimalarial compounds have been reported [2–10].

The blood-to-plasma concentration ratio (B/P, defined as the drug concentration in whole blood divided by that in plasma) is a measure of the drug distribution within the blood. Drugs, on reaching the bloodstream, can bind to plasma proteins and/or to blood cells. If a drug's binding in plasma exceeds its binding in blood cells, B/P values are below 1 (B/P < 1). When drug binding in blood cells exceeds its plasma binding, B/P values are above 1 (B/P > 1). Red blood cells (RBCs) are the host cells for malaria parasites, and any effect of the drug on red cell membranes might be relevant for its in vivo effects [11]. B/P may be an important parameter in drug potency and is therefore worthy of investigation. It is related to either the volume of distribution or the clearance of the drug. Even though the determination of B/P is relatively simple, such data are absent from most pharmacokinetic studies [12, 13].
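The interpretation of the ratio given above can be illustrated with a short sketch. The function name `blood_to_plasma` and the sample concentrations are hypothetical and are not taken from the study; the code only restates the definition (blood concentration divided by plasma concentration) and its base-10 logarithm.

```python
import math

def blood_to_plasma(c_blood, c_plasma):
    """Return (ratio, log10 ratio, compartment where binding dominates)."""
    if c_blood <= 0 or c_plasma <= 0:
        raise ValueError("concentrations must be positive")
    ratio = c_blood / c_plasma
    if ratio > 1:
        where = "blood cells"   # binding in blood cells exceeds plasma binding
    elif ratio < 1:
        where = "plasma"        # plasma binding exceeds binding in blood cells
    else:
        where = "balanced"
    return ratio, math.log10(ratio), where

# Hypothetical concentrations in arbitrary but identical units
ratio, log_ratio, where = blood_to_plasma(c_blood=2.0, c_plasma=1.0)
print(ratio, round(log_ratio, 3), where)
```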

The first objective of this study was to develop QSAR models that explain the in vitro antimalarial activity of a new series of 4-aminoquinolines, structurally related to CQ, against various clones of P. falciparum (3D7, W2) using theoretical molecular descriptors. The second aim was to establish regression models to predict the blood-to-plasma concentration ratio using mainly in silico molecular descriptors.

2. Materials and Methods

2.1. Software

A Pentium IV personal computer (CPU, 3.2 GHz) running the Windows XP operating system was used. Molecular modeling and geometry optimization were performed with HyperChem [14]. Dragon software [15] was employed to calculate the theoretical molecular descriptors. SPSS software [16] was used for MLR analysis. Other statistical calculations were performed in the MATLAB [17] environment.

2.2. Ensemble Data and Molecular Descriptors

We used a series of 4-aminoquinoline antimalarial compounds with experimentally determined ADME properties, taken from the paper by Ray et al. [18]. That research group had developed antimalarial compounds effective against drug-resistant strains of P. falciparum by varying the chemical substitutions around the heterocyclic ring and the basic amine side chain of the popular antimalarial drug chloroquine [19, 20]. More recently, they screened a panel of these novel antimalarial compounds for improved leads based on the evaluated ADMET properties [18]. Each compound in the studied database was characterized by its growth inhibition of the 3D7 and W2 strains of P. falciparum and by its blood-to-plasma concentration ratio. Figure 1 depicts all the structures used in this study. The panel includes a small number of CQ analogues with altered substitutions on the quinoline ring, although the majority of the compounds in the panel contain substitutions of the alkyl groups attached to the basic nitrogen position on the aminoalkyl side chain. Two data sets of log IC50 (each compound at 1 and 10 μM concentration) were used for the QSAR studies. The activity data are given as IC50 (nM) values, referring to the growth inhibition of the drug-resistant (W2) and drug-sensitive (3D7) strains of P. falciparum. The experimental values of these antimalarial activities are shown in Table 1. The red blood cell (RBC)-to-plasma partition ratio was measured for each compound at 1 and 10 μM. The values were normalized by transforming them to the logarithm of the ratio of drug concentration in blood cells to that in plasma (log B/P). The values are summarized in Table 2. The molecular structures of all the Chloroquine derivatives were built with HyperChem (Version 7, HyperCube, Inc.) software. AM1 semiempirical calculations were used to optimize the 3D geometry of the molecules. The Polak-Ribière algorithm with a root-mean-square gradient of 0.1 kcal/mol was selected for optimization. Using DRAGON [15], we derived a total of 1481 1D, 2D, and 3D molecular descriptors from the 3D structure of each compound.

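Table 1. Experimental (Exp.), predicted (Pred.), and residual (Res.) values of log IC50 for the 3D7 and W2 strains.
Columns: compound number (a); log IC50, 3D7: Exp., Pred. (b), Res. (c); log IC50, W2: Exp., Pred. (b), Res. (c).
a Number of compounds given in Figure 1.
b Values calculated by equations in Table 10.
c Observed minus calculated values.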

Table 2. Experimental (Exp.) values of log B/P at 1 μM and 10 μM, with predicted values and residuals.
a Number of compounds given in Figure 1.
b Values calculated by equations in Table 12.
c Observed minus calculated values.

To decrease the redundancy in the descriptor data matrix, the correlation of the descriptors with each other and with the properties of the drugs was examined, and collinear descriptors (i.e., those whose pairwise correlation exceeded the chosen cutoff) were detected. Among each set of collinear descriptors, the one with the highest correlation with activity was retained, and the others were removed from the data matrix.
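The pruning rule described above can be sketched as follows. This is a minimal illustration, not the procedure actually run in the study: the function name `prune_collinear`, the cutoff of 0.9, and the synthetic data are all assumptions introduced here.

```python
import numpy as np

def prune_collinear(X, y, names, cutoff=0.9):
    """Drop collinear descriptors, keeping the one best correlated with y.

    cutoff=0.9 is an assumed value, not the threshold used in the study.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # |r| between each descriptor column and the activity
    r_y = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    # |r| between every pair of descriptor columns
    r_xx = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    # Visit descriptors from most to least correlated with the activity,
    # so the best member of each collinear group is seen first
    for j in np.argsort(-r_y):
        if all(r_xx[j, k] <= cutoff for k in kept):
            kept.append(int(j))
    kept.sort()
    return [names[j] for j in kept]

# Synthetic example: d2 is almost a multiple of d1, d3 is independent
rng = np.random.default_rng(0)
a = rng.normal(size=30)
b = rng.normal(size=30)
X = np.column_stack([a, 2.0 * a + 0.001 * rng.normal(size=30), b])
y = 3.0 * X[:, 1] + 0.5 * b
kept = prune_collinear(X, y, ["d1", "d2", "d3"])
print(kept)
```

Only one of the two nearly identical columns survives, together with the independent one.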

The list and meaning of the molecular descriptors are provided by the DRAGON package, and the calculation procedure is explained in detail, with related literature references, in the Handbook of Molecular Descriptors [21].

2.3. MLR Modeling Procedure

Multiple linear regression (MLR), which combines ease of implementation with interpretability of the resulting equations, was the statistical method of choice for building the QSAR models. The forward-stepping variant of MLR was used, starting with the selection of the single variable that contributes most to the model, judged by the highest F-statistic or the lowest p-value. At each step, MLR adds predictor variables to the model from the previous step and terminates the search once a statistically significant model has been obtained [22, 23]. QSAR Modeling [24] is a free Java-based program developed by the research group of the Theoretical and Applied Chemometrics Laboratory. A genetic algorithm (GA) search was carried out to explore MLR models. The GA was the same as that used previously [25, 26].
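The forward-stepping idea can be sketched in a few lines of NumPy. This is an illustrative sketch only: it is not the SPSS or QSAR Modeling implementation, it does not include the GA search, and the entry threshold `f_to_enter=4.0`, the function names, and the synthetic data are assumptions made here.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of a least-squares fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def forward_select(X, y, max_vars=5, f_to_enter=4.0):
    """Greedy forward selection with a partial F-test entry rule.

    f_to_enter is an assumed threshold, not a value from the study.
    """
    n, p = X.shape
    selected = []
    current_rss = float(((y - y.mean()) ** 2).sum())  # intercept-only model
    while len(selected) < max_vars:
        # Find the unused descriptor that lowers the RSS the most
        best = None
        for j in range(p):
            if j in selected:
                continue
            trial = rss(X[:, selected + [j]], y)
            if best is None or trial < best[1]:
                best = (j, trial)
        if best is None:
            break
        j, new_rss = best
        df_resid = n - (len(selected) + 1) - 1
        if df_resid <= 0:
            break
        # Partial F for adding one variable to the current model
        f_partial = (current_rss - new_rss) / max(new_rss / df_resid, 1e-12)
        if f_partial < f_to_enter:
            break
        selected.append(j)
        current_rss = new_rss
    return selected

# Synthetic data: 21 "compounds", 8 descriptors, only columns 0 and 3 matter
rng = np.random.default_rng(1)
X = rng.normal(size=(21, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=21)
chosen = forward_select(X, y)
print(chosen)
```

A GA-based search differs in that it evaluates whole subsets of descriptors rather than adding them one at a time, which is why the two approaches can select different models.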

3. Results and Discussions

3.1. The Selected Descriptors

The majority of the descriptors selected in our GA-MLR modeling are composite descriptors, which can be divided into five groups: GETAWAY, 3D-MoRSE, RDF, WHIM, and 2D autocorrelation descriptors. The GETAWAY (GEometry, Topology, and Atom-Weights AssemblY) descriptors try to match the 3D molecular geometry provided by the molecular influence matrix and atom relatedness by topology with chemical information by using various atomic weighting schemes. The 3D-MoRSE descriptors are representations of the 3D structure of a molecule and encode features such as molecular weight, van der Waals volume, electronegativities, and polarizabilities. The radial distribution function (RDF) descriptors are based on the distance distribution of the compounds. WHIM descriptors are based on statistical indices calculated on the projections of atoms along principal axes. 2D autocorrelation descriptors, in general, explain how the considered property is distributed along the topological structure. Three spatial autocorrelation vectors were calculated: the unweighted and weighted Moran, Geary, and Broto-Moreau autocorrelation vectors. The atomic masses (m), atomic van der Waals volumes (v), atomic Sanderson electronegativities (e), and atomic polarizabilities (p) were used as weighting properties [21]. Table 3 lists the names and meanings of the molecular descriptors used in this work.

Table 3. Names and meanings of the molecular descriptors used in this work.

GETAWAY descriptors
 H4u: H autocorrelation of lag 4 / unweighted
 HATS5u: leverage-weighted autocorrelation of lag 5 / unweighted
 R2u+: R maximal autocorrelation of lag 2 / unweighted
 R8u+: R maximal autocorrelation of lag 8 / unweighted
 R3m: R autocorrelation of lag 3 / weighted by atomic masses
 R4m+: R maximal autocorrelation of lag 4 / weighted by atomic masses

3D-MoRSE descriptors
 Mor16u: 3D-MoRSE, signal 16 / unweighted
 Mor06m: 3D-MoRSE, signal 06 / weighted by atomic masses
 Mor24m: 3D-MoRSE, signal 24 / weighted by atomic masses
 Mor32v: 3D-MoRSE, signal 32 / weighted by atomic van der Waals volumes

RDF descriptors
 RDF020u: Radial Distribution Function, 2.0 / unweighted
 RDF095u: Radial Distribution Function, 9.5 / unweighted
 RDF040m: Radial Distribution Function, 4.0 / weighted by atomic masses
 RDF090m: Radial Distribution Function, 9.0 / weighted by atomic masses
 RDF095m: Radial Distribution Function, 9.5 / weighted by atomic masses

WHIM descriptors
 G3p: 3rd component symmetry directional WHIM index / weighted by atomic polarizabilities
 Gu: G total symmetry index / unweighted

2D autocorrelations
 MATS4m: Moran autocorrelation, lag 4 / weighted by atomic masses
 MATS6m: Moran autocorrelation, lag 6 / weighted by atomic masses
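To make the 2D autocorrelation entries above more concrete, the sketch below computes a Moran coefficient at a given topological lag for a small molecular graph. It follows the usual textbook form of the Moran coefficient; DRAGON's exact conventions (hydrogen handling, scaling of the atomic weights) may differ, and the four-atom chain and its weights are hypothetical.

```python
from collections import deque

def topological_distances(adjacency):
    """All-pairs shortest path lengths (in bonds) by breadth-first search."""
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if dist[start][v] is None:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist

def moran_autocorrelation(adjacency, weights, lag):
    """Moran coefficient of the atomic weights at a given topological lag."""
    n = len(weights)
    mean = sum(weights) / n
    dev = [w - mean for w in weights]
    dist = topological_distances(adjacency)
    # Atom pairs separated by exactly `lag` bonds
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n) if dist[i][j] == lag]
    variance = sum(d * d for d in dev) / n
    if not pairs or variance == 0:
        return 0.0
    covariance = sum(dev[i] * dev[j] for i, j in pairs) / len(pairs)
    return covariance / variance

# Hypothetical linear chain of four atoms: 0-1-2-3, with alternating weights
chain = [[1], [0, 2], [1, 3], [2]]
weights = [1.0, 2.0, 1.0, 2.0]
lag1 = moran_autocorrelation(chain, weights, lag=1)
lag2 = moran_autocorrelation(chain, weights, lag=2)
print(lag1, lag2)
```

For this alternating pattern, neighbours always differ and second neighbours always match, so the coefficient is negative at lag 1 and positive at lag 2.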

Tables 4 and 5 show the values of the descriptors used in this study. The correlation matrices of the descriptors are given in Tables 6, 7, 8, and 9. Inspection of these results shows that all the correlation values deviate noticeably from unity, so there is no significant correlation between the independent variables.

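Table 4. Descriptor values for the log IC50 models. Columns: log IC50 (3D7), RDF020u, Mor06m, R8u+, Gu, HATS5u; log IC50 (W2), RDF040m, H4u, MATS6m, Mor16u, Mor32v.

Table 5. Descriptor values for the log B/P models. Columns: log B/P (1 μM), RDF065u, GATS8m, RDF090m, R2u+, Mor24m; log B/P (10 μM), R3m, RDF095u, G3p, R4m+, MATS4m.

Tables 6 to 9. Correlation matrices of the descriptors. Column headers: RDF020u, Mor06m, R8u+, Gu, HATS5u; RDF065u, GATS8m, RDF090m, R2u+, Mor24m; R3m, RDF095u, G3p, R4m+, MATS4m.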


3.2. Validation of the Models

Goodness of fit was assessed using the squared correlation coefficient (R²), the adjusted determination coefficient (R²adj), the standard deviation (s), the root-mean-square error (RMSE), Fisher's statistic (F), and the number of variables. Most QSAR modeling methods implement the leave-one-out (LOO) or leave-many-out (LMO) cross-validation procedures, which are internal validation techniques [27]. The LOO procedure removes one data point from the training set, builds the model on the remaining training data only, and then tests it on the removed point. The LMO procedure builds the models leaving multiple observations out at a time, which reduces the number of times the model has to be recalculated. The outcome of the cross-validation procedure is the cross-validated Q² (LOO-Q² or LMO-Q²), which is used as a criterion of both the robustness and the predictive ability of the model. In this paper, we performed LOO and leave-5-out cross-validation as the internal validation tools. The robustness of the models was examined with the Y-randomization test [28]. For the Y-randomization test, performed ten times, R² ≤ 0.3 and Q² ≤ 0.05 for all results were considered acceptable. These limits were selected based on the suggestions of Eriksson and coworkers [28]. The Y-randomization test can verify whether models with high values of R² and Q² result from chance correlation [29, 30].
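The internal validation steps described above can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: Q² is taken as 1 − PRESS/SS_total with the overall mean of y in the denominator, the leave-5-out folds are formed from one random permutation, and all function names and the synthetic data are introduced here for illustration.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Least-squares MLR with intercept; returns predictions for X_test."""
    A = np.column_stack([np.ones(len(y_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coef

def q2_cv(X, y, leave_out=1, seed=0):
    """Cross-validated Q2, leaving `leave_out` points out per fold."""
    n = len(y)
    order = np.random.default_rng(seed).permutation(n) if leave_out > 1 else np.arange(n)
    press = 0.0
    for start in range(0, n, leave_out):
        test = order[start:start + leave_out]
        train = np.setdiff1d(order, test)
        pred = fit_predict(X[train], y[train], X[test])
        press += float(((y[test] - pred) ** 2).sum())
    return 1.0 - press / float(((y - y.mean()) ** 2).sum())

def r2_fit(X, y):
    """Ordinary R2 of the model fitted to all the data."""
    pred = fit_predict(X, y, X)
    return 1.0 - float(((y - pred) ** 2).sum()) / float(((y - y.mean()) ** 2).sum())

def y_randomization(X, y, n_rounds=10, seed=0):
    """Refit after shuffling y; returns (R2, Q2_LOO) for each shuffled response."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_rounds):
        y_shuffled = rng.permutation(y)
        results.append((r2_fit(X, y_shuffled), q2_cv(X, y_shuffled)))
    return results

# Synthetic data standing in for 21 compounds and 3 descriptors
rng = np.random.default_rng(2)
X = rng.normal(size=(21, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=21)
q2_loo = q2_cv(X, y, leave_out=1)
q2_l5o = q2_cv(X, y, leave_out=5)
shuffled = y_randomization(X, y)
print(round(q2_loo, 3), round(q2_l5o, 3), len(shuffled))
```

The shuffled-response values returned by `y_randomization` are the quantities that would be compared against the limits quoted in the text.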

To validate the predictive power of the models more realistically, external validation was also performed. For that purpose, six Chloroquine derivatives (3, 6, 8, 15, 18, and 19) were selected at random from the 21 compounds to form the external test set, and the remaining 15 Chloroquine derivatives formed the training set used to calibrate the QSAR models.
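The split described above can be written down directly. In this sketch, the compound numbers of the test set come from the text, while the assumption that compounds are numbered 1 to 21 in row order, the function name `external_validation`, the choice of test-set statistics, and the synthetic data are all illustrative.

```python
import numpy as np

TEST_IDS = [3, 6, 8, 15, 18, 19]        # compound numbers given in the text
ALL_IDS = list(range(1, 22))            # 21 compounds, assumed numbered 1..21
TRAIN_IDS = [i for i in ALL_IDS if i not in TEST_IDS]

def external_validation(X, y, train_ids, test_ids):
    """Fit on the training compounds; return test-set RMSE and squared correlation."""
    tr = np.array(train_ids) - 1        # 1-based compound numbers to row indices
    te = np.array(test_ids) - 1
    A = np.column_stack([np.ones(len(tr)), X[tr]])
    coef, *_ = np.linalg.lstsq(A, y[tr], rcond=None)
    pred = np.column_stack([np.ones(len(te)), X[te]]) @ coef
    rmse = float(np.sqrt(np.mean((y[te] - pred) ** 2)))
    r2 = float(np.corrcoef(y[te], pred)[0, 1] ** 2)
    return rmse, r2

# Synthetic stand-in for the descriptor matrix and activities
rng = np.random.default_rng(3)
X = rng.normal(size=(21, 2))
y = X @ np.array([1.0, -1.0]) + 0.05 * rng.normal(size=21)
rmse, r2 = external_validation(X, y, TRAIN_IDS, TEST_IDS)
print(len(TRAIN_IDS), len(TEST_IDS))
```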

3.3. QSAR Models for 3D7 and W2 Strains

Using the best multilinear regression method, equations for the antimalarial activities against the Chloroquine-sensitive (3D7) and Chloroquine-resistant (W2) strains of P. falciparum were constructed with up to five descriptors. The predicted log IC50 values and the residuals for the compounds are listed in Table 1. The QSAR models generated for the two strains (3D7, W2) are shown in Table 10. These models explain the observed values of biological activity well, with a high correlation coefficient and a low root-mean-square error (R² = 0.94, R²adj = 0.92, and RMSE = 0.14 for the 3D7 strain; R² = 0.94, R²adj = 0.91, and RMSE = 0.16 for the W2 strain). To validate the selected prediction functions, cross-validation and an external test were carried out. The models also have good predictive capacity (Q² = 0.86 for both strains). In general, the MLR models were able to explain the data variance and were quite stable to the inclusion-exclusion of compounds, as measured by the LOO correlation coefficients (Q² > 0.5). Also, the results of the LMO test are collected in Table 4. From a theoretically acceptable model the cannot have smaller values than and or . Overall, the best model is achieved when and . The Y-randomization results are in agreement with the suggested limits [28], which indicates that the variance explained by the models is not due to chance correlation. The Y-randomization results are shown in Figures 2 and 3. The training set equations and their statistical parameters are summarized in Table 11. Plots of the LOO cross-validation and test set predictions versus the experimental log IC50 values (for the 3D7 and W2 strains) for the MLR models are shown in Figure 4.

Table 10. QSAR models for the 3D7 and W2 strains.

3D7: log IC50 = −13.327 (1.632) + 0.236 (0.028) RDF020u + 0.280 (0.069) Mor06m + 13.883 (2.181) R8u+ + 63.224 (12.011) Gu − 5.649 (1.022) HATS5u
R² = 0.94, R²adj = 0.92, RMSE = 0.14, F = 44.81, Q²LOO = 0.86, Q²LMO = 0.87

W2: log IC50 = −93.816 (12.475) + 0.131 (0.014) RDF040m + 1.285 (0.204) H4u + 86.388 (12.118) MATS6m − 1.021 (0.201) Mor16u − 2.177 (0.581) Mor32v
R² = 0.94, R²adj = 0.91, RMSE = 0.16, F = 44.43, Q²LOO = 0.86, Q²LMO = 0.77
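As a usage illustration, the 3D7 equation of Table 10 can be applied to a set of descriptor values as shown below. The coefficients are those printed in the table; the function name `predict_log_ic50` and the descriptor values in `example` are hypothetical and do not correspond to any compound in the study.

```python
# Coefficients of the 3D7 model as printed in Table 10
COEF_3D7 = {
    "intercept": -13.327,
    "RDF020u": 0.236,
    "Mor06m": 0.280,
    "R8u+": 13.883,
    "Gu": 63.224,
    "HATS5u": -5.649,
}

def predict_log_ic50(descriptors, coef=COEF_3D7):
    """Apply a linear QSAR equation to a dict of descriptor values."""
    missing = [name for name in coef if name != "intercept" and name not in descriptors]
    if missing:
        raise KeyError(f"missing descriptors: {missing}")
    return coef["intercept"] + sum(
        value * descriptors[name] for name, value in coef.items() if name != "intercept"
    )

# Hypothetical descriptor values, chosen only to exercise the function
example = {"RDF020u": 10.0, "Mor06m": 1.0, "R8u+": 0.1, "Gu": 0.2, "HATS5u": 0.5}
value = predict_log_ic50(example)
print(round(value, 4))
```

Predictions of this kind are only meaningful for compounds structurally similar to those in the training set, with descriptors computed in the same way.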

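Table 11. Training set equations and statistical parameters for the 3D7 and W2 models (column groups: Training set, Test set).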

3.4. QSAR Model for Blood-to-Plasma Concentration Ratio

The best linear models, tabulated in Table 12, each use five descriptors to describe the log B/P values. The predicted values and the residuals for the compounds are listed in Table 2. As can be seen, the MLR models have good statistical quality with low prediction error. The models were validated by calculating the cross-validated Q² values using the LOO cross-validation method, which is a measure of the predictive power of the regression equations. The Q² values of the best regression models for log B/P were suggestive of robust models. The results of the LMO test are collected in Table 3. On average, the overall test steps and which is another proof that the model is not underdetermined. The models were further validated by applying Y-randomization, with several random shuffles of the Y vector. The Y-randomization results are in agreement with the suggested limits [28] and are shown in Figures 5 and 6. The prediction ability of the MLR models was also tested using the validation set of data (Table 13). The correlations between the predicted and experimental values of log B/P (from LOO cross-validation and the external test) are shown in Figure 7.

Table 12. QSAR models for log B/P at 1 μM and 10 μM.

B/P (1 μM): log B/P = 0.948 (0.140) − 0.013 (0.002) RDF065u − 33.174 (6.778) GATS8m − 0.055 (0.007) RDF090m + 4.018 (0.582) R2u+ + 0.305 (0.104) Mor24m
R² = 0.92, R²adj = 0.89, RMSE = 0.04, F = 32.69, Q²LOO = 0.84, Q²LMO = 0.85

B/P (10 μM): log B/P = −21.744 (6.135) − 3.861 (0.431) R3m + 0.033 (0.008) RDF095m + 28.454 (5.396) G3p − 9.616 (2.220) R4m+ + 20.931 (6.131) MATS4m
R² = 0.87, R²adj = 0.83, RMSE = 0.10, F = 20.55, Q²LOO = 0.74, Q²LMO = 0.80

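Table 13. Statistical parameters of the log B/P models for the training and test sets.

B/P (1 μM): training set R² = 0.89, R²adj = 0.83, RMSE = 0.05, F = 14.57; test set R² = 0.93, RMSE = 0.06
B/P (10 μM): training set R² = 0.87, R²adj = 0.80, RMSE = 0.11, F = 12.03; test set R² = 0.84, RMSE = 0.09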
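As a consistency check on the training-set figures for B/P (1 μM), the standard definitions of the adjusted determination coefficient and Fisher's statistic can be evaluated by hand. The calculation assumes that the first reported value (0.89) is R², that the training set has n = 15 compounds (Section 3.2), and that the model has k = 5 descriptors; the inputs are rounded, so agreement is only approximate.

```latex
R^2_{\mathrm{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
                   = 1 - 0.11 \times \frac{14}{9} \approx 0.83,
\qquad
F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}
  = \frac{0.89 / 5}{0.11 / 9} \approx 14.6
```

Both results agree with the reported values of 0.83 and 14.57 to within rounding.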

4. Conclusions

A quantitative structure-activity relationship (QSAR) study was applied to a series of 4-aminoquinoline antimalarial compounds potentially active against the 3D7 and W2 strains of P. falciparum. For each strain, statistically significant models were obtained using the GA-based MLR method. These models may be considered as mathematical equations for predicting the antimalarial activities of compounds structurally similar to those used in this study. Models based on GA-MLR were also developed to predict the blood-to-plasma concentration ratio of the analogues from selected molecular descriptors. The predictive ability of the models was confirmed with the external validation set. The LOO and LMO cross-validation methods, the Y-randomization technique, and the external validation indicated that the models are significant and robust and have good internal and external predictability. These models may be an important tool in early drug discovery by providing a relevant pharmacokinetic parameter.


The authors thank the Young Researchers Club, Hamedan Branch of Islamic Azad University for financial support. The authors wish to thank Professor E. B. de Melo and Dr. R. Ghavami for their precious help on this work. Anonymous reviewers are gratefully acknowledged for their helpful suggestions that have led to improving the paper.


References

  1. World Health Organization, “World malaria report,” 2009.
  2. A. R. Katritzky, O. V. Kulshyn, I. Stoyanova-Slavova et al., “Antimalarial activity: a QSAR modeling using CODESSA PRO software,” Bioorganic and Medicinal Chemistry, vol. 14, no. 7, pp. 2333–2357, 2006.
  3. R. García-Domenech, W. López-Peña, Y. Sanchez-Perdomo et al., “Application of molecular topology to the prediction of the antimalarial activity of a group of uracil-based acyclic and deoxyuridine compounds,” International Journal of Pharmaceutics, vol. 363, no. 1-2, pp. 78–84, 2008.
  4. H. Ojha, P. Gahlot, A. K. Tiwari, M. Pathak, and R. Kakkar, “Quantitative structure activity relationship study of 2,4,6-trisubstituted-s-triazine derivatives as antimalarial inhibitors of Plasmodium falciparum dihydrofolate reductase,” Chemical Biology and Drug Design, vol. 77, no. 1, pp. 57–62, 2011.
  5. P. K. Ojha and K. Roy, “Chemometric modeling, docking and in silico design of triazolopyrimidine-based dihydroorotate dehydrogenase inhibitors as antimalarials,” European Journal of Medicinal Chemistry, vol. 45, pp. 4645–4656, 2010.
  6. P. K. Ojha and K. Roy, “Chemometric modelling of antimalarial activity of aryltriazolylhydroxamates,” Molecular Simulation, vol. 36, no. 12, pp. 939–952, 2010.
  7. P. Shah and M. I. Siddiqi, “3D-QSAR studies on triclosan derivatives as Plasmodium falciparum enoyl acyl carrier reductase inhibitors,” SAR and QSAR in Environmental Research, vol. 21, no. 5-6, pp. 527–545, 2010.
  8. S. C. Basak, D. Mills, D. M. Hawkins, and A. K. Bhattacharjee, “Quantitative structure-activity relationship studies of antimalarial compounds from their calculated mathematical descriptors,” SAR and QSAR in Environmental Research, vol. 21, no. 1-2, pp. 103–125, 2010.
  9. K. Roy and P. K. Ojha, “Advances in quantitative structure-activity relationship models of antimalarials,” Expert Opinion on Drug Discovery, vol. 5, no. 8, pp. 751–778, 2010.
  10. K. Roy and P. P. Roy, “QSAR of cytochrome inhibitors,” Expert Opinion on Drug Metabolism and Toxicology, vol. 5, no. 10, pp. 1245–1266, 2009.
  11. W. Asawamahasakda, A. Benakis, and S. R. Meshnick, “The interaction of artemisinin with red cell membranes,” Journal of Laboratory and Clinical Medicine, vol. 123, no. 5, pp. 757–762, 1994.
  12. R. J. Riley, D. F. McGinnity, and R. P. Austin, “A unified model for predicting human hepatic, metabolic clearance from in vitro intrinsic clearance data in hepatocytes and microsomes,” Drug Metabolism and Disposition, vol. 33, no. 9, pp. 1304–1311, 2005.
  13. P. Paixão, L. F. Gouveia, and J. A. G. Morais, “Prediction of drug distribution within blood,” European Journal of Pharmaceutical Sciences, vol. 36, pp. 544–554, 2009.
  14. HyperChem. 7.0, Hypercube Incorporation.
  15. R. Todeschini, Milano Chemometrics and QSAR Group.
  16. SPSS, 16.0, SPSS Incorporation.
  17. MATLAB, 7.0, MathWorks Incorporation.
  18. S. Ray, B. Madrid, P. Catz et al., “Development of a new generation of 4-aminoquinoline antimalarial compounds using predictive pharmacokinetic and toxicology models,” Journal of Medicinal Chemistry, vol. 53, pp. 3685–3695, 2010.
  19. P. B. Madrid, A. P. Liou, J. L. DeRisi, and R. K. Guy, “Incorporation of an intramolecular hydrogen-bonding motif in the side chain of 4-aminoquinolines enhances activity against drug-resistant P. falciparum,” Journal of Medicinal Chemistry, vol. 49, no. 15, pp. 4535–4543, 2006.
  20. P. B. Madrid, J. Sherrill, A. P. Liou, J. L. Weisman, J. L. DeRisi, and R. K. Guy, “Synthesis of ring-substituted 4-aminoquinolines and evaluation of their antimalarial activities,” Bioorganic and Medicinal Chemistry Letters, vol. 15, no. 4, pp. 1015–1018, 2005.
  21. R. Todeschini and V. Consonni, Handbook of Molecular Descriptors, Wiley-VCH, London, UK, 2000.
  22. R. B. Darlington, Regression and Linear Models, McGraw-Hill, New York, NY, USA, 1990.
  23. A. Najafi and S. S. Ardakani, “2D autocorrelation modelling of the anti-HIV HEPT analogues using multiple linear regression approaches,” Molecular Simulation, vol. 37, no. 1, pp. 72–83, 2011.
  24. QSAR Modeling, 2010, Theoretical and Applied Chemometrics Laboratory, State University of Campinas, Campinas, Brazil.
  25. A. Najafi, S. S. Ardakani, and M. Marjani, “Quantitative structure-activity relationship analysis of the anticonvulsant activity of some benzylacetamides based on genetic algorithm-based multiple linear regression,” Tropical Journal of Pharmaceutical Research, vol. 10, no. 4, p. 483, 2011.
  26. R. Ghavami, A. Najafi, M. Sajadi, and F. Djannaty, “Genetic algorithm as a variable selection procedure for the simulation of 13C nuclear magnetic resonance spectra of flavonoid derivatives using multiple linear regression,” Journal of Molecular Graphics and Modelling, vol. 27, no. 2, pp. 105–115, 2008.
  27. K. Baumann and N. Stiefl, “Validation tools for variable subset regression,” Journal of Computer-Aided Molecular Design, vol. 18, no. 7–9, pp. 549–562, 2004.
  28. L. Eriksson, J. Jaworska, A. P. Worth, M. T. D. Cronin, R. M. McDowell, and P. Gramatica, “Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs,” Environmental Health Perspectives, vol. 111, no. 10, pp. 1361–1375, 2003.
  29. A. Tropsha, P. Gramatica, and V. K. Gombar, “The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models,” QSAR and Combinatorial Science, vol. 22, no. 1, pp. 69–77, 2003.
  30. E. B. D. Melo and M. M. C. Ferreira, “Multivariate QSAR study of 4,5-dihydroxypyrimidine carboxamides as HIV-1 integrase inhibitors,” European Journal of Medicinal Chemistry, vol. 44, no. 9, pp. 3577–3583, 2009.

Copyright © 2013 Amir Najafi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
