Abstract

Sets of quinolizidinyl derivatives of bi- and tricyclic (hetero)aromatic systems were studied as selective inhibitors. On this basis, a quantitative structure-activity relationship (QSAR) study was carried out on quinolizidinyl derivatives as potent inhibitors of acetylcholinesterase in Alzheimer’s disease (AD). Multiple linear regression (MLR), partial least squares (PLS), principal component regression (PCR), and least absolute shrinkage and selection operator (LASSO) were used to create QSAR models. Geometry optimization of the compounds was carried out by the B3LYP method employing the 6-31G basis set. The HyperChem, Gaussian 98W, and Dragon software programs were used for geometry optimization of the molecules and calculation of the quantum chemical descriptors. Finally, the Unscrambler program was used for data analysis. In the present study, the root mean square error of calibration and R² obtained with the MLR method were 0.1434 and 0.95, respectively. The R and R² values obtained from the stepwise MLR model were 0.79 and 0.62, respectively. The R² and mean square values obtained with the LASSO method were 0.766 and 3.226, respectively. The root mean square error of calibration and R² obtained with the PLS method were 0.3726 and 0.62, respectively. According to these results, the MLR model is the most favorable of the statistical methods compared and is suitable for use in QSAR modeling.

1. Introduction

Alzheimer’s disease (AD) is a debilitating illness with unmet medical needs [1]. The number of people afflicted with the disease worldwide is expected to triple by 2050 [2]. The multifactorial pathogenesis of AD includes accumulation of aggregates of β-amyloid (Aβ) and tau protein and loss of cholinergic neurons, with a consequent deficit of the neurotransmitter acetylcholine (ACh) [3, 4]. As AD advances, acetylcholinesterase (AChE) levels in the brain decline [5].

The well-known theory of quantitative structure-activity relationships (QSARs) [6–8] is based on the hypothesis that the biological activity of a chemical compound is mainly determined by its molecular structure [6]. QSAR attempts to find a consistent relationship between biological activity and molecular properties, so that these “rules” can be used to predict the activity of new compounds from their structures.

Today, QSARs are being applied in many disciplines, with much emphasis on drug design. Over the years of development, many methods, algorithms, and techniques have been discovered and applied in QSAR studies [9, 10]. To date, QSARs are among the important applications of chemometric tools, with the objective of developing predictive models that can be used in different areas of chemistry, including medicinal, agricultural, environmental, and materials chemistry [11–13].

Drug discovery often involves the use of QSAR to identify chemical structures that could have good inhibitory effects on specific targets [15]. The aim of QSAR analysis is to investigate the correlation between activity (generally biological activity) and the physicochemical properties of a set of molecules [16].

The PLS regression technique is especially useful in the quite common case where the number of descriptors (independent variables) is comparable to or greater than the number of compounds (data points), and/or where other factors lead to correlations between variables. In this case, the solution of the classical least squares problem either does not exist or is unstable and unreliable. The PLS approach, on the other hand, leads to stable, correct, and highly predictive models even for correlated descriptors [17].
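To illustrate why PLS remains usable when descriptors outnumber compounds, the following minimal sketch fits a one-component PLS1 model in plain Python. The toy data (3 compounds, 4 descriptors) and the function names `pls1_one_component` and `predict` are hypothetical and are not taken from this study, whose models were built with SPSS and Unscrambler.

```python
def pls1_one_component(X, y):
    """Fit a one-component PLS1 model; returns (coefficients, intercept)."""
    n, p = len(X), len(X[0])
    # Center descriptors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance with y (X^T y, normalized).
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Latent scores t = X w, then regress y on the single score.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    coef = [q * v for v in w]
    intercept = y_mean - sum(coef[j] * x_mean[j] for j in range(p))
    return coef, intercept


def predict(coef, intercept, x):
    """Apply the fitted linear model to one descriptor vector."""
    return intercept + sum(c * v for c, v in zip(coef, x))


# Hypothetical data: more descriptors (4) than compounds (3), and
# strongly correlated columns, so ordinary least squares is ill-posed.
X = [[0.0, 0.0, 0.0, 0.0],
     [1.0, 2.0, 0.0, -1.0],
     [2.0, 4.0, 0.0, -2.0]]
y = [1.0, 2.0, 3.0]

coef, intercept = pls1_one_component(X, y)
fitted = [predict(coef, intercept, row) for row in X]
```

The regression is carried out on a single latent score rather than on the four raw descriptors, which is why a solution exists even though the descriptor matrix is rank deficient.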

PCR is a combination of principal component analysis (PCA) and MLR. The first step in PCR is to decompose the spectral data matrix using PCA. Generally, there are two types of decomposition techniques. The first computes eigenvectors and eigenvalues; the second is singular value decomposition (SVD). We used SVD to decompose the spectral data matrix because it is generally accepted as the most stable and numerically accurate technique [18, 19].
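The two PCR steps (decomposition, then regression on the scores) can be sketched as follows. To stay dependency-free, this sketch obtains the first principal direction by power iteration on X^T X, which yields the first right singular vector; it is a stand-in for the full SVD described above, not the procedure used in the study. The data and the name `pcr_one_component` are hypothetical.

```python
def pcr_one_component(X, y, iters=100):
    """One-component PCR; returns (coefficients, intercept)."""
    n, p = len(X), len(X[0])
    # Center descriptors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Step 1 (PCA): power iteration on X^T X gives the first principal
    # direction, i.e. the first right singular vector of the centered matrix.
    v = [1.0] * p
    for _ in range(iters):
        s = [sum(Xc[i][j] * v[j] for j in range(p)) for i in range(n)]
        u = [sum(Xc[i][j] * s[i] for i in range(n)) for j in range(p)]
        norm = sum(c * c for c in u) ** 0.5
        v = [c / norm for c in u]
    # Step 2 (regression): least squares of y on the component scores.
    t = [sum(Xc[i][j] * v[j] for j in range(p)) for i in range(n)]
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coef = [b * c for c in v]
    intercept = y_mean - sum(coef[j] * x_mean[j] for j in range(p))
    return coef, intercept


# Hypothetical data: 4 compounds, 3 collinear descriptors.
X = [[3.0, 4.0, 7.0],
     [5.0, 5.0, 5.0],
     [7.0, 6.0, 3.0],
     [9.0, 7.0, 1.0]]
y = [4.0, 6.0, 8.0, 10.0]

coef, intercept = pcr_one_component(X, y)
new_compound = [11.0, 8.0, -1.0]
prediction = intercept + sum(c * v for c, v in zip(coef, new_compound))
```

Unlike PLS, the component here is chosen from the descriptor matrix alone; the response enters only in the second step.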

LASSO translates each coefficient by a constant factor, truncating at zero. This is called soft thresholding. Best subset selection of size M drops all variables with coefficients smaller in absolute value than the M-th largest. This is a form of hard thresholding.
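The difference between the two thresholding rules can be shown on a handful of coefficients. This is a minimal sketch that applies the rules directly to given least squares coefficients, which matches the estimators exactly only for orthonormal descriptors; the function names, the threshold `lam`, and the sample coefficients are hypothetical.

```python
def soft_threshold(beta, lam):
    """LASSO-style rule: shift the coefficient toward zero by lam, truncating at zero."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


def best_subset(betas, m):
    """Hard thresholding: keep the m largest coefficients by magnitude, zero the rest."""
    ranked = sorted(range(len(betas)), key=lambda i: abs(betas[i]), reverse=True)
    keep = set(ranked[:m])
    return [b if i in keep else 0.0 for i, b in enumerate(betas)]


betas = [3.0, -0.5, -2.5, 1.0]  # hypothetical least squares coefficients
lasso_like = [soft_threshold(b, 1.0) for b in betas]
subset_like = best_subset(betas, 2)
```

Soft thresholding shrinks every surviving coefficient, whereas hard thresholding leaves the retained coefficients unchanged.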

2. Computational Details

The 3D structures of the molecules were drawn using the built optimum option of HyperChem software (version 8.0). Then, the structures were fully optimized based on the ab initio method at the DFT level of theory. HyperChem (version 3.0) and Dragon (version 3.0) programs were employed to calculate the molecular descriptors. All calculations were performed using the Gaussian 98W program series. Geometry optimization of the compounds was carried out by the B3LYP method employing the 6-31G basis set [20].

In this study, the independent variables were molecular descriptors, and the dependent variables were the actual half maximal inhibitory concentration (IC50) values. More than 1498 theoretical descriptors were selected and calculated. These descriptors can be classified into several groups including: (i) constitutional, (ii) topological, (iii) molecular walk counts, (iv) BCUT, (v) Galvez topological charge indices, (vi) autocorrelations, (vii) charge, (viii) aromaticity indices, (ix) randic molecular profiles, (x) geometrical, (xi) RDF, (xii) MoRSE, (xiii) WHIM, (xiv) GETAWAY, (xv) functional groups, (xvi) atom-centred, (xvii) empirical, and (xviii) properties descriptors. Finally, Unscrambler (version 9.7) program was used for analysis of data and statistical calculation.

For the compounds in each training set, a correlation equation was derived with the same descriptors. The resulting equation was then used to predict log (1/IC50) values for the compounds in the corresponding test set. In the present work, stepwise multiple linear regression (stepwise MLR) was used to select the most appropriate descriptors from the full set. In total, 1498 descriptors were generated. Two programs, SPSS (version 19) and Unscrambler, were used for MLR, PLS, PCR, and LASSO.
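The response variable log (1/IC50) can be computed from a measured IC50 as in the short sketch below. It assumes a base-10 logarithm and IC50 values expressed in mol/L, neither of which is stated explicitly in the text; the function name and sample values are hypothetical.

```python
import math


def log_inverse_ic50(ic50_molar):
    """Return log10(1/IC50) for an IC50 given in mol/L (assumed unit)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return math.log10(1.0 / ic50_molar)


# Hypothetical IC50 values: 1 micromolar and 50 nanomolar.
sample_ic50 = [1e-6, 5e-8]
activities = [log_inverse_ic50(v) for v in sample_ic50]
```

On this scale a more potent inhibitor (lower IC50) receives a larger value, which is why the transformed quantity is convenient as a regression target.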

3. Results and Discussions

The structures of the quinolizidinyl derivatives used in this study are shown in Table 1. Since the variation in the chemical structure of the considered compounds is low, the selection of chemical descriptors that can encode small variations between the structures of the molecules in the data set is very important. In this respect, GETAWAY descriptors are very informative 3D descriptors that can encode structural features of molecules. The most significant descriptors selected are as follows [14, 20]: G ( ), ARR, Te, MATS6e, Mor31m, and Mor18m.

The mean values of selected descriptors are shown in Table 2. As can be seen from this table, atomic masses and electronegativities were important descriptors in our study.

The descriptors selected through these methods were used to construct linear models using the PCR and PLS methods. Statistical parameters of the different constructed QSAR models are shown in Table 3. The R² and RMSE values for calibration obtained with the MLR method are better than those of the two other methods. In the present study, the root mean square error of calibration and R² obtained with the MLR method were 0.1434 and 0.95, respectively.
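For reference, the two calibration statistics quoted here can be computed as in the following sketch. It uses the common definitions (RMSE with a denominator of n, and R² as one minus the ratio of residual to total sum of squares); the exact conventions used by Unscrambler or SPSS may differ, and the observed and predicted values below are hypothetical.

```python
def rmse(y_true, y_pred):
    """Root mean square error with denominator n."""
    n = len(y_true)
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n) ** 0.5


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and calibrated log (1/IC50) values.
observed = [1.0, 2.0, 3.0, 4.0]
calibrated = [1.1, 1.9, 3.2, 3.8]

rmsec = rmse(observed, calibrated)
r2 = r_squared(observed, calibrated)
```

A lower RMSE and a higher R² both indicate a closer fit to the calibration data, which is the sense in which one method is called better than another in Table 3.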

Considering the experimental error, the overall prediction of the log (1/IC50) values was quite satisfactory. The results of MLR method were much better than the two other methods.

In the present study, linear variable selection methods were used to select the most significant descriptors (stepwise MLR) (Table 4).

The performance of the QSAR model in predicting log (1/IC50) values was also estimated using the internal cross-validation method. The resulting predictions of log (1/IC50) using the PLS and PCR methods in the gas phase are given in Table 5.
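Internal cross-validation can be sketched as follows, assuming the leave-one-out variant (the text does not say which scheme was used) and, for brevity, a single-descriptor linear model instead of the PLS and PCR models of the study. The data and the names `fit_line` and `loo_q2` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_q2(x, y):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / SS_tot, plus the predictions."""
    preds = []
    for i in range(len(x)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(intercept + slope * x[i])
    my = sum(y) / len(y)
    press = sum((a - b) ** 2 for a, b in zip(y, preds))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1.0 - press / ss_tot, preds


# Hypothetical descriptor values and activities for five compounds.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]

q2, loo_predictions = loo_q2(descriptor, activity)
```

Because each compound is predicted by a model that never saw it, Q² is a more conservative measure than the calibration R².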

4. Conclusion

In our study, the linear methods were used to select the most significant descriptors. The stepwise MLR, MLR, PLS, and PCR were used to construct a quantitative relation between the activities of quinolizidinyl derivatives and their calculated descriptors. MLR has been successfully used for finding a QSAR model for quinolizidinyl derivatives. It provides the best results in comparison with other studied methods. Our present attempt to correlate the log (1/IC50) with theoretically calculated molecular descriptors has led to a relatively successful QSAR model that relates these derivatives. The results obtained from stepwise MLR method were suitable for drug design and classification.

Conflict of Interests

The authors declare that they have no conflict of interests.

Acknowledgment

The authors thank the Research vice Presidency of Islamic Azad University, Rasht Branch, for their encouragement, permission, and financial support.