Abstract

Quantum chemical parameters such as LUMO energy, HOMO energy, ionization energy ($I$), electron affinity ($A$), chemical potential ($\mu$), hardness ($\eta$), electronegativity ($\chi$), philicity ($\omega_k^{\alpha}$), and electrophilicity ($\omega$) of a series of aliphatic compounds are calculated at the B3LYP/6-31G(d) level of theory. Quantitative structure-activity relationship (QSAR) models are developed for predicting the toxicity (pIGC50) of 13 classes of aliphatic compounds, including 171 electron acceptors and 81 electron donors, towards Tetrahymena pyriformis. The multiple linear regression modeling of the toxicity of these compounds is performed by using the molecular descriptor log P (1-octanol/water partition coefficient) in conjunction with two other quantum chemical descriptors, electrophilicity ($\omega$) and energy of the lowest unoccupied molecular orbital ($E_{\mathrm{LUMO}}$). The toxicity-predicting ability of electrophilicity ($\omega$) is compared with that of $E_{\mathrm{LUMO}}$ as a global chemical reactivity descriptor used in addition to log P. The former works marginally better in most cases. There is a slight improvement in the quality of regression on changing the unit of IGC50 from mg/L to molarity and on removing the racemates and the diastereoisomers from the data set.

1. Introduction

Quantitative structure-activity relationship (QSAR) analysis is aimed at deriving empirical models that relate the activity of chemical compounds to their structure [2]. The underlying assumption is that the chemical structure of a compound implicitly determines its behavior towards biological systems. Appropriate structural or functional descriptors are used to represent the chemical structure, and the analysis results in a mathematical model describing the relationship between the chemical structure and the biological activity. Different types of descriptors have been employed, which are of constitutional, geometrical, topological, electrotopological, steric, electrostatic, electronic, and quantum chemical origins. The main scientific purposes of developing a QSAR model include: (1) understanding the mechanism of interaction between compounds and biological systems, (2) gaining information about the dose range for the biological effect of a chemical compound, which in turn can be useful in experimental drug design and toxicity testing, and (3) predicting the activity of new chemical compounds. Further, QSAR models can save the time and experimental resources needed for synthesizing and biologically testing a large number of compounds, and they offer the possibility of reducing or replacing animal use in research and toxicity testing. Various statistical methods are used in QSAR analysis. These methods include regression analysis, partial least squares, classification trees, and neural networks [3].

For the development of a useful QSAR model, the most important step is to assess the mode of biochemical action of the toxicant on the biological system at the cellular and molecular levels. There are many approaches to evaluating the mechanistic basis of toxicity. Some of these are in vitro tests [4], joint toxicity tests [5], fish acute toxicity syndromes [6], and evaluation of the mechanism on the basis of structural parameters. The mechanisms of toxicity range from noncovalent effects to electrophilic ones involving covalent binding with biological macromolecules. Among the varied modes of toxic action, the narcotic mechanism involves nonspecific, noncovalent, reversible interactions of the toxicants with cell membranes [7]. Nonpolar narcotics are neutral nonreactive compounds such as aliphatic alcohols, ketones, ethers, and so forth, whose toxic effect is assumed to be determined mainly by the lipid solubility [8]. Polar narcotics are less inert aromatic chemical species, such as phenols and anilines, which usually possess a hydrogen donor group [9].

A large number of QSAR studies of acute toxicity have been reported in the literature [10]. Many authors [11–15] have reported quantitative relationships between toxicity and hydrophobicity, wherein hydrophobicity is represented by the octanol-water partition coefficient or the octanol-water distribution coefficient as the descriptor. These model relationships are assumed to represent a “baseline effect,” whereby no completely soluble and nonvolatile chemical compound can exhibit toxicity less than that predicted by such relationships. Schultz et al. [16] have investigated the toxicity of a large data set of 500 aliphatic chemicals towards the protozoan Tetrahymena pyriformis in terms of their IGC50 values using the octanol-water partition coefficient. Some authors [17] have reported that dimyristoyl phosphatidylcholine-water partition coefficients give a better statistical fit than octanol-water partition coefficients in QSARs for the inhibition of T. pyriformis population growth by nonpolar narcotics, polar narcotics, and esters. Roberts and Costello [18] have developed QSAR models for the toxicity prediction of 18 nonpolar and polar narcotics to the fish Poecilia reticulata using octanol-water and membrane-water partition coefficients. Freidig and Hermens [19] have reported QSAR models for toxicity prediction in the cases of Poecilia reticulata (14-day LC50) and Pimephales promelas (4-day LC50). These authors have developed separate one-parameter QSAR models for a group of narcotics and for reactive compounds, using a hydrophobicity descriptor for the narcotics and an electronic descriptor for the reactive compounds.

The response-surface approach has been widely used for the development of mechanistically comprehensible QSAR models for toxicity. The basic premise of this approach is that the toxic action depends on the biouptake and bioavailability as well as on the electrophilic reactivity of the toxicant at an active site. Researchers have employed a partition or distribution coefficient as the descriptor encoding biouptake and bioavailability, and the energy of the lowest unoccupied molecular orbital ($E_{\mathrm{LUMO}}$) or the maximum acceptor superdelocalisability ($A_{\max}$) as the descriptor encoding the electrophilic reactivity. This approach has been applied to different species, including the bacterium Vibrio fischeri [20], the protozoan Tetrahymena pyriformis [21, 22], the yeast Saccharomyces cerevisiae [23], the mould Aspergillus nidulans [24], the algae Scenedesmus obliquus [25] and Chlorella vulgaris [24], the plant Cucumis sativus [26, 27], and mice [24]. The response-surface approach has been extended by adding indicator variables and other parameters to improve the statistical fit of the models [28, 29].

Our group has carried out toxicity analysis of a diverse class of systems using conceptual density functional theory-based reactivity/selectivity descriptors like electronegativity, hardness, electrophilicity, and so forth. It has been shown that the toxicity values for a wide variety of polyaromatic hydrocarbons like polychlorinated biphenyls (PCBs), polychlorinated dibenzofurans (PCDFs), polychlorinated dibenzo-p-dioxins (PCDDs), and chlorophenols (CP), as well as arsenic derivatives and several aliphatic and aromatic toxic molecules, calculated using various conceptual DFT descriptors, especially global and local electrophilicities, correlate well with their corresponding experimental toxicity values [30–39]. In an earlier study, we reported an atom counting and electrophilicity-based QSTR protocol for predicting the toxicity of aliphatic compounds towards a protozoan, Tetrahymena pyriformis [40].

In the present work, we develop QSAR models for the toxicity of several classes of aliphatic compounds using quantum chemical descriptors, along with the molecular descriptor log P. We attempt a comparative evaluation of two quantum chemical parameters, namely, the electrophilicity index ($\omega$) and the energy of the lowest unoccupied molecular orbital ($E_{\mathrm{LUMO}}$), as toxicity-predicting descriptors towards Tetrahymena pyriformis. We intend to check whether the electrophilicity index ($\omega$) is a marginally better toxicity-predicting descriptor than the LUMO energy when used in addition to log P (a hydrophobicity-encoding descriptor).

2. Computational Method

All the geometries are optimized using the GAUSSIAN 03 set of codes [41]. The hybrid B3LYP density functional, which combines the Becke exchange functional [42] with the correlation functional of Lee et al. [43], and the 6-31G(d) basis set are used for the optimization of all the molecules studied in the present work. Frequency analysis is performed on the optimized structures at the same level of theory, and no imaginary frequencies are found. The quantum chemical descriptors such as electron affinity, ionization potential, chemical potential, hardness, and electrophilicity are calculated directly from the orbital energies of the optimized geometries.

3. Theoretical Background

3.1. Quantum Chemical Descriptors

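The electrophilicity index ($\omega$) [44–46] is defined as a measure of the decrease in energy due to the maximal transfer of electrons from a donor to an acceptor system and is given as

$$\omega = \frac{\mu^{2}}{2\eta}, \tag{1}$$

where $\mu$ and $\eta$ are the chemical potential [47] and hardness [48], respectively. Chemical potential and hardness can be expressed in terms of ionization energy ($I$) and electron affinity ($A$) as given below:

$$\mu = -\frac{I + A}{2}, \qquad \eta = \frac{I - A}{2}. \tag{2}$$

Using Koopmans’ approximation, $I$ and $A$ can be expressed in terms of the energies of the highest occupied ($E_{\mathrm{HOMO}}$) and the lowest unoccupied molecular orbital ($E_{\mathrm{LUMO}}$) as

$$I \approx -E_{\mathrm{HOMO}}, \qquad A \approx -E_{\mathrm{LUMO}}. \tag{3}$$

The condensed Fukui functions are defined as

$$f_k^{+} = q_k(N+1) - q_k(N), \qquad f_k^{-} = q_k(N) - q_k(N-1), \qquad f_k^{0} = \tfrac{1}{2}\left[q_k(N+1) - q_k(N-1)\right], \tag{4}$$

where $q_k$ is the associated electronic population on atom $k$ in a molecule.

The philicity at any atomic site $k$ is defined as [49]

$$\omega_k^{\alpha} = \omega f_k^{\alpha}, \tag{5}$$

where $\omega_k^{\alpha}$ ($\alpha = +$, $-$, and $0$) represent local philic quantities describing nucleophilic, electrophilic, and radical attacks, respectively.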
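As an illustration of how these descriptors follow from orbital energies, the sketch below evaluates $I$, $A$, $\mu$, $\chi$, $\eta$, and $\omega$ under Koopmans' approximation, and the condensed Fukui functions and philicities from atomic populations. It assumes the finite-difference convention $\eta = (I - A)/2$ and the usual relation $\chi = -\mu$; the function names, the input values in any usage, and the choice of population scheme are hypothetical and are not taken from the present work.

```python
from dataclasses import dataclass


@dataclass
class GlobalDescriptors:
    ionization_energy: float
    electron_affinity: float
    chemical_potential: float
    electronegativity: float
    hardness: float
    electrophilicity: float


def global_descriptors(e_homo: float, e_lumo: float) -> GlobalDescriptors:
    """Global descriptors from frontier orbital energies (same energy unit throughout)."""
    ionization_energy = -e_homo          # Koopmans: I ~ -E_HOMO
    electron_affinity = -e_lumo          # Koopmans: A ~ -E_LUMO
    mu = -(ionization_energy + electron_affinity) / 2.0
    eta = (ionization_energy - electron_affinity) / 2.0   # assumed convention
    if eta <= 0.0:
        raise ValueError("hardness must be positive (E_LUMO must lie above E_HOMO)")
    omega = mu ** 2 / (2.0 * eta)
    return GlobalDescriptors(ionization_energy, electron_affinity, mu, -mu, eta, omega)


def philicities(omega, q_neutral, q_anion, q_cation):
    """Per-atom philicities omega * f_k^alpha for alpha = '+', '-', '0'.

    q_neutral, q_anion, q_cation are electronic populations on each atom for the
    N-, (N+1)-, and (N-1)-electron systems, listed in the same atom order.
    """
    result = []
    for qn, qa, qc in zip(q_neutral, q_anion, q_cation):
        f_plus = qa - qn              # governs nucleophilic attack
        f_minus = qn - qc             # governs electrophilic attack
        f_zero = (qa - qc) / 2.0      # governs radical attack
        result.append({"+": omega * f_plus, "-": omega * f_minus, "0": omega * f_zero})
    return result
```

For example, `global_descriptors(-7.0, -1.0)` corresponds to a hypothetical molecule with $E_{\mathrm{HOMO}} = -7$ eV and $E_{\mathrm{LUMO}} = -1$ eV.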

3.2. Regression Analysis

Regression analysis is a statistical method wherein the functional dependence of a dependent variable on a set of independent variables is determined. In linear regression analysis, this dependence has a linear form, which can be expressed as

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p,$$

where $b_1, \ldots, b_p$ are the regression coefficients, $b_0$ is the intercept, $x_1, \ldots, x_p$ are the independent variables, and $\hat{y}$ represents the value of the dependent variable expected by the regression model.

The above equation represents a hyperplane over the $p$-dimensional space of the independent variables, where $p$ is the number of independent variables in the equation. This regression equation can be used for predicting values of the dependent variable from the values of the independent variables.
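A linear model of this form can be fitted by ordinary least squares. The following is a minimal sketch, assuming a plain normal-equations solution with Gauss-Jordan elimination; it is not the software used in this study. The function names and the descriptor values are hypothetical, and the synthetic response is generated from known coefficients purely to show that the fit recovers them.

```python
def fit_linear_model(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bp*xp.

    rows: one sequence of p descriptor values per observation.
    Returns [b0, b1, ..., bp], obtained by solving (X^T X) b = X^T y.
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    # Augmented normal-equation matrix [X^T X | X^T y].
    aug = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(design, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; the fit is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coeffs, row):
    """Predicted value of the dependent variable for one observation."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


# Synthetic, made-up descriptor values (not data from this study).
log_p = [0.5, 1.0, 1.5, 2.0, 2.5]
omega = [1.2, 0.9, 1.5, 1.1, 1.8]
rows = list(zip(log_p, omega))
y = [0.5 + 0.8 * lp + 0.3 * w for lp, w in rows]
coeffs = fit_linear_model(rows, y)
```

Because the synthetic response is exactly linear in the two descriptors, `coeffs` reproduces the generating intercept and slopes to within rounding error.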

For determining the quality of the statistical fit, the Pearson correlation coefficient ($r$) (for regression with a single independent variable) or the coefficient of determination ($R^2$) is used; for a single independent variable, $R^2 = r^2$. The coefficient of determination has the mathematical form

$$R^{2} = \frac{\mathrm{ESS}}{\mathrm{TSS}} = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}},$$

where TSS is the total sum of squares, $\mathrm{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^{2}$, with $n - 1$ degrees of freedom; ESS is the explained sum of squares, $\mathrm{ESS} = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^{2}$, with $p$ degrees of freedom; and RSS is the residual sum of squares, $\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^{2}$, with $n - p - 1$ degrees of freedom. Here $y_i$ is the observed value of the dependent variable, $\hat{y}_i$ is the value of the dependent variable predicted by the regression model, $\bar{y}$ is the mean value of the dependent variable, $n$ is the number of observations, and $p$ is the number of independent variables included in the regression model.

If the $R^2$ value is greater than 0.5, the variance explained by the model (ESS) is larger than the unexplained variance (RSS). The regression equation is considered efficient when the value of $R^2$ is close to 1. The number of independent variables in the equation and the size of the data sample affect the value of $R^2$. When a new variable is added to the regression equation, the value of $R^2$ may increase or remain the same, even if the added variable does not contribute to reducing the unexplained variance in the dependent variable. Therefore, another statistical parameter, the adjusted $R^2$ value, is used, which is given by the equation

$$R^{2}_{\mathrm{adj}} = 1 - \left(1 - R^{2}\right)\frac{n - 1}{n - p - 1},$$

where $n$ is the sample size and $p$ is the number of independent variables. The value of $R^{2}_{\mathrm{adj}}$ decreases if a variable added to the equation does not reduce the unexplained variance enough to offset the lost degree of freedom.

The uncertainty in the model is represented by the standard error of estimate,

$$s = \sqrt{\mathrm{RMS}} = \sqrt{\frac{\mathrm{RSS}}{n - p - 1}},$$

where RMS is the residual mean square. The standard error of estimate reflects the dispersion of the observed values of the dependent variable about the regression line. Larger values of $s$ mean a worse statistical fit of the model and less reliable predictions.

The statistical significance of a regression equation can be assessed by means of the Fisher ($F$) value,

$$F = \frac{\mathrm{EMS}}{\mathrm{RMS}},$$

where EMS is the explained mean square, given as ESS/$p$, and RMS is the residual mean square. A regression equation is considered to be statistically significant if the observed $F$ value is greater than the tabulated value for the chosen level of significance and the corresponding degrees of freedom of $F$. The degrees of freedom of $F$ are equal to $p$ and $n - p - 1$.
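The fit statistics described in this section can be computed directly from the observed and predicted values. The sketch below is a hedged illustration only; the function name and the dictionary keys are hypothetical. It evaluates ESS from its definition, so the identity TSS = ESS + RSS holds only when the predictions come from a least-squares fit that includes an intercept.

```python
import math


def fit_statistics(observed, predicted, p):
    """R^2, adjusted R^2, standard error of estimate, and F for a linear fit.

    observed, predicted: sequences of equal length n.
    p: number of independent variables in the regression model.
    """
    n = len(observed)
    if n - p - 1 <= 0:
        raise ValueError("need more observations than fitted parameters")
    mean_y = sum(observed) / n
    tss = sum((y - mean_y) ** 2 for y in observed)                    # n - 1 d.o.f.
    ess = sum((yh - mean_y) ** 2 for yh in predicted)                 # p d.o.f.
    rss = sum((y - yh) ** 2 for y, yh in zip(observed, predicted))    # n - p - 1 d.o.f.
    r2 = 1.0 - rss / tss
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    rms = rss / (n - p - 1)                 # residual mean square
    ems = ess / p                           # explained mean square
    f_value = ems / rms if rms > 0.0 else math.inf
    return {
        "R2": r2,
        "R2_adj": r2_adj,
        "s": math.sqrt(rms),
        "F": f_value,
    }


# Made-up example: y regressed on x = 1..5 gives the least-squares line -0.4 + 1.2 x.
observed = [1.0, 2.0, 3.0, 4.0, 6.0]
predicted = [0.8, 2.0, 3.2, 4.4, 5.6]
stats = fit_statistics(observed, predicted, p=1)
```

The resulting $F$ value would then be compared with the tabulated value for $p$ and $n - p - 1$ degrees of freedom at the chosen significance level.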

A reliable and transparent regression analysis must satisfy certain basic assumptions, which can be briefly enumerated as follows:
(1) The observations of the response variable are independent of one another.
(2) The relationship between the dependent and the independent variable(s) is linear.
(3) The residuals (observed minus predicted values of the dependent variable) follow the normal distribution.
(4) The variance of the residuals is constant for all values of the independent variables.
(5) The independent variables do not show multicollinearity (a high level of intercorrelation) or redundancy.
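The last assumption in the list above can be screened before fitting by inspecting pairwise correlations between descriptors. The following is a simple sketch of such a check, not a procedure reported in this work; the function names, the threshold of 0.9, and the descriptor values are all assumptions chosen for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a constant descriptor has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(descriptors, threshold=0.9):
    """Return (name_a, name_b, r) for descriptor pairs with |r| >= threshold.

    descriptors: mapping from descriptor name to its list of values.
    threshold: assumed cut-off; a suitable value depends on the data set.
    """
    names = list(descriptors)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(descriptors[first], descriptors[second])
            if abs(r) >= threshold:
                flagged.append((first, second, r))
    return flagged


# Made-up descriptor columns: "x2" is an exact multiple of "x1".
example = {
    "x1": [1.0, 2.0, 3.0, 4.0],
    "x2": [2.0, 4.0, 6.0, 8.0],
    "x3": [1.0, -1.0, -1.0, 1.0],
}
flagged = collinear_pairs(example)
```

Pairs returned by `collinear_pairs` would be candidates for dropping one descriptor before a multi-parameter regression is attempted.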

4. Results and Discussion

The quantum chemical descriptors like LUMO energy, HOMO energy, ionization energy, electron affinity, chemical potential, hardness, philicity, and electrophilicity of a series of aliphatic compounds are calculated from optimized geometries using Equations (1)–(5) (see Table S1 and Table S2 in Supplementary Materials available online at doi: 10.1155/2010/545087). The main objective of this work is to assess the two quantum chemical descriptors, electrophilicity index ($\omega$) and LUMO energy, which are commonly employed in toxicology studies to represent molecular electrophilicities. We perform a detailed regression analysis using 13 classes of aliphatic compounds, including 171 electron acceptors and 81 electron donors, to develop model equations, using electrophilicity index ($\omega$), LUMO energy, and log P, to predict the toxicity of such chemical compounds towards Tetrahymena pyriformis. The general regression equations obtained by using one-parameter and two-parameter models for all the aliphatic acceptors and donors are as follows. (a) For aliphatic acceptors: (b) For aliphatic donors:

The toxicity values based on these equations, along with the experimentally observed toxicity values, are given in Table S3 and Table S4 (see Supplementary Materials available online at doi: 10.1155/2010/545087). Although the two-parameter equations employing log P and either of the electronic descriptors ($\omega$ or $E_{\mathrm{LUMO}}$) show slightly better correlation than the one-parameter models, the overall toxicity predictability of these equations is poor, as is evident from the values of the correlation coefficients and the calculated toxicity values. These generalized equations therefore cannot be used as model equations for accurately predicting the toxicities of the aliphatic compounds.

In order to obtain better predictability and correlation, a stepwise regression analysis is performed by taking each class of chemical compounds separately. The experimentally observed and the calculated toxicity values (pIGC50), along with various descriptors, are presented in Tables 1 and 2 for a set of electron acceptors and a set of electron donors, respectively. The corresponding one-parameter regression equations ($\omega$, $E_{\mathrm{LUMO}}$, and log P) and two-parameter regression equations (($\omega$, log P) and ($E_{\mathrm{LUMO}}$, log P)) are shown in Table 3. As is evident from Table 3, the one parameter regression equation based on alone does not show any meaningful correlation between the experiment and the calculated toxicity values. The regression equations based on show improved correlation coefficients over the equations based on for all the electron acceptors and electron donors, except for unsaturated alcohols. However, the adjusted $R^2$ value is less than 0.70 for diols, acetylenic alcohols, unsaturated alcohols, and amines. For all the electron donor aliphatic compounds, the values are negligible, with the exception of amino alcohols. It is remarkable that the one-parameter regression equations obtained by using log P as the independent variable show a considerably improved overall correlation compared to those using the electronic descriptors, the electrophilicity ($\omega$) and $E_{\mathrm{LUMO}}$. This result is expected, since the hydrophobicity and lipophilicity of the chemical compounds mainly govern their toxic actions at the cellular and molecular levels. However, as a whole, the stepwise one-parameter regression analysis based on the electronic parameters or log P shows that neither a global electrophilicity descriptor ($\omega$ or $E_{\mathrm{LUMO}}$) nor a hydrophobicity descriptor (log P) alone is enough for modeling the toxicity of these compounds with sufficiently high predictive power.

To improve the predictability of the regression equations and to assess the relative usefulness of the two quantum chemical descriptors, a two-parameter regression analysis is performed. The results indicate that there is an overall better correlation between the experimental toxicity values and the calculated values upon the addition of an electrophilicity descriptor ($\omega$ or $E_{\mathrm{LUMO}}$) to a model, in conjunction with log P. The plots of the observed toxicity values (pIGC50) versus those predicted on the basis of the individual regression equations for the complete sets of aliphatic acceptors and donors are presented in Figures 1 and 2. It may be noted that the calculated values of pIGC50 in Figures 1 and 2 are obtained from separate regression equations for each individual class of compounds, as reported in Table 3.

These plots reveal that the two-parameter model based on electrophilicity index ($\omega$) and log P ( for acceptors, 0.888 for donors) is marginally better than that based on $E_{\mathrm{LUMO}}$ and log P ( for acceptors, 0.842 for donors). However, for the individual groups of molecules, the correlation obtained with $\omega$ and log P as independent variables is better only for the electron acceptor compounds, with the exception of halogenated alcohols, saturated alcohols, monoesters, and ketones, where the values are almost the same. In comparison, for the electron donors the values are slightly better when $E_{\mathrm{LUMO}}$ and log P are used in the regression equation than when electrophilicity ($\omega$) and log P are used, except in the case of amino alcohols. The calculated toxicity values (pIGC50), along with the experimental values, for all 13 groups of aliphatic compounds studied are reported in Tables 1 and 2.

These results suggest that the electrophilicity index ($\omega$) is a marginally better chemical reactivity descriptor than $E_{\mathrm{LUMO}}$ in most cases. We may recommend toxicity prediction using either of them along with log P. However, a generalized pattern to that effect needs further validation, probably by considering a wider variety of chemical toxicants. Although it is expected that a mechanistic basis of the toxic action may be envisaged from the descriptors used, toxicity predictions based on these model relationships should be treated with caution.

As suggested by the Referee, we change the unit of IGC50 from mg/L (as used in [1, 16]) to molarity and remove all the racemates and diastereoisomers (also used in those references) from the data set. The respective regression equations are provided in Tables 4 and 5, and the plots of calculated versus observed pIGC50 values are presented in Figures 3 and 4. For the individual groups, the correlation improves in most cases. The overall correlation improves for both the electron donors and the electron acceptors, and the overall conclusion remains the same. It may be suggested that log P and $\omega$ should be used to predict the toxicity of various aliphatic electron donors and acceptors towards Tetrahymena pyriformis.
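The change of unit mentioned above amounts to dividing the mass concentration by the molar mass. A minimal sketch is given below; the function names are hypothetical, and the definition of pIGC50 as the negative decadic logarithm of IGC50 in mol/L is an assumption made here for illustration, since other conventions (for example, concentrations in mM) are also in use. The numerical input in the example is made up and is not a measured IGC50.

```python
import math


def mg_per_l_to_molar(conc_mg_per_l: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass concentration in mg/L to a molar concentration in mol/L."""
    if conc_mg_per_l <= 0.0 or molar_mass_g_per_mol <= 0.0:
        raise ValueError("concentration and molar mass must be positive")
    grams_per_litre = conc_mg_per_l / 1000.0
    return grams_per_litre / molar_mass_g_per_mol


def p_igc50(igc50_molar: float) -> float:
    """pIGC50 taken as -log10(IGC50 in mol/L); this convention is an assumption."""
    return -math.log10(igc50_molar)


# Hypothetical input: 46.07 mg/L of a compound with molar mass 46.07 g/mol
# (the molar mass of ethanol) corresponds to about 1e-3 mol/L.
example_molar = mg_per_l_to_molar(46.07, 46.07)
example_p = p_igc50(example_molar)
```

Because the molar mass differs from compound to compound, this conversion changes the relative ordering of toxicity values within a class, which is why the regression statistics can change.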

5. Conclusions

The toxicity of the aliphatic compounds considered in this study cannot be completely explained on the basis of hydrophobicity and lipophilicity considerations alone. Model QSAR equations with improved toxicity predictability can be developed by taking the electrophilic property of the molecular system into consideration in addition to the hydrophobicity. The “response surface” models proposed by earlier authors have mostly used $E_{\mathrm{LUMO}}$ as the global parameter for the electrophilic reactivity. The results of this study show that the electrophilicity index ($\omega$) and $E_{\mathrm{LUMO}}$ are comparably capable of describing the contribution of chemical reactivity to the toxicity of aliphatic compounds. The electrophilicity index seems to be a marginally more efficient descriptor for toxicity prediction than $E_{\mathrm{LUMO}}$. Better QSAR models are obtained by removing the racemates and the diastereoisomers from the data set and by changing the unit of IGC50 from mg/L to molarity, as suggested by the Referee.

Supporting Information Available

Quantum chemical parameters such as LUMO energy, HOMO energy, ionization energy ($I$), electron affinity ($A$), chemical potential ($\mu$), hardness ($\eta$), electronegativity ($\chi$), philicity ($\omega_k^{\alpha}$), and electrophilicity ($\omega$) of 171 electron acceptors and 81 electron donors, and the experimental and calculated values of pIGC50 for electron acceptors and electron donors calculated on the basis of the overall regression equations using ($\omega$, log P) and ($E_{\mathrm{LUMO}}$, log P), are available as supporting information.

Table S1 contains the calculated HOMO energies, LUMO energies, ionization energies, electron affinities, electronegativities, hardness, and chemical potential of 171 aliphatic electron acceptors. Table S2 gives the same quantities as Table S1 for 81 aliphatic electron donors. Electrophilicity ($\omega$), energy of the lowest unoccupied molecular orbital ($E_{\mathrm{LUMO}}$), log P, and the observed and calculated values of pIGC50 for the complete set of aliphatic acceptor and donor compounds with Tetrahymena pyriformis are given in Tables S3 and S4.

Acknowledgments

We thank Dr. Rana Ashour for the invitation, the Referee for constructive criticism, and Professor A. Basak for helpful discussion. A. H. Pandith thanks Indian Academy of Sciences, Bangalore, for providing him a Summer Research Fellowship to visit IIT Kharagpur. We would like to thank CSIR, New Delhi, for financial support and D. R. Roy, R. Parthasarathi, B. Maiti, V. Subramanian, M. Elango, J. Padmanabhan, P. Bultink, S. Van Damme, and U. Sarkar for their help and assistance in various ways during this study.
