Abstract

A series of pyridylthiazole derivatives developed by Lawrence et al. as Rho-associated protein kinase inhibitors were subjected to four-dimensional quantitative structure-activity relationship (4D-QSAR) analysis. The models were generated applying genetic algorithm (GA) optimization combined with partial least squares (PLS) regression. The best model presented validation values of , , , , , , and . Furthermore, analyzing the descriptors it was possible to propose new compounds that predicted higher inhibitory concentration values than the most active compound of the series.

1. Introduction

Rho-associated protein kinases (ROCKs) belong to Ser/Thr kinases protein family that are initially activated by the Rho family of GTPases. Two isoforms, ROCK1 and ROCK2, were identified [1]. They share significant homologous sequence in the kinase domain (90%) and in the C-terminal regulatory domains show a significant deviation [2, 3]. Both are ubiquitously expressed in several tissues of humans and rodents.

The ROCK enzymes have been implicated in a variety of therapeutic areas including cardiovascular diseases [4], central nervous system disorders [5], inflammation [6], and cancer [7]. Abnormal activation of the Rho/Rho kinase has shown itself to be important in the activation of pathways that contribute to the pathophysiology of certain diseases, such as arterial and pulmonary hypertension, glaucoma, diabetes, erectile dysfunction, neurodegeneration, and cancer [8, 9]. Thus, the ROCK inhibition is a promising strategy to prevent the invasion of malignant cells, a central event in the metastatic process [1012], besides enabling a future target for treatment of various diseases.

Y-27632 and Y-30142 (Figure 1) are known as highly selective Rho-kinase inhibitors, as compared to other kinases [13]. They are permeable in the cell and act by competing with ATP for the kinase activation site having an inhibitory action in its two isoforms: ROCK1 and ROCK2 [14]. these components inhibits smooth muscle contraction and shows effectiveness in normalizing the blood pressure in hypertension models. Compound HA-1077, also known as fasudil hydrochloride or AT877 (Figure 1), is reported as an inhibitor of several protein kinases including Rho kinase with a similar affinity for the Y-27632, but with lower selectivity [15], and is currently the only Rho kinase inhibitor available for clinical use. Since 1995 it has been used in Japan for vasospasm prevention in patients with subarachnoid hemorrhage [16]. CID5056270 (Figure 1) has also been reported to potently inhibit ROCK enzymatic activity [17].

A useful tool to develop new prototype compounds for inhibition/activation of the Rho/ROCK is the study of quantitative relationships between chemical structure and biological activity or some physicochemical property (QSAR/QSPR). The QSAR aims to develop logical mathematical models with a response of the biological property by quantifying of the chemical information. Subtle changes in the structure can cause changes in its pharmacological/toxicological activity [18]. It may be used to understand and explain the driving forces behind the drug action at the molecular level and it allows the design and development of new compounds with desirable biological properties [19]. Using QSAR it is possible to reduce the time and financial cost for the development of new structures and improvement of their effectiveness and selectivity, in addition to reducing the number of experiments conducted on animals [18, 20]. Therefore, the main purpose of this study is to use the 4D-QSAR method to propose structural changes in pyridylthiazole derivatives [21] to make them future candidates for ROCK1 inhibitors.

2. Materials and Methods

2.1. Biological Data

A series of 51 pyridylthiazole analogous compounds have been taken from the report of Lawrence et al. [21] (Figure 6). They were selected because of their ROCK1 isoform inhibition. The IC50 (half maximal inhibitory concentration) values were converted into molar units and then expressed as (pIC50). Among the 51 compounds, 9 were chosen to compose the test set. Three randomly selected test groups were formed in order to obtain greater credibility for the model. Test Group I contains molecules 1, 22, 30, 31, 35, 38, 39, 41, and 43; in Test Group II molecules were 3, 5, 6, 14, 17, 18, 28, 44, and 51; and Test Group III includes molecules 9, 12, 15, 21, 32, 37, 48, 50, and 51.

2.2. Crystal Structure of ROCK1

The X-ray crystallographic structure coordinates of ROCK1 enzyme (EC 2.7.11.1) with resolution of 2.75 Ǻ, in complex with 1-[(1R)-1-(3-methoxyphenyl)ethyl]-3-(4-pyridin-4-yl-1,3-thiazol-2-yl)urea [22], were obtained from the Protein Data Bank (PDB ID code: 3TV7) [23]. The catalytic site was used for modeling the training and test compounds (Figure 6). After that, the complexes were optimized through molecular mechanics, using the CHARMM force field [24] available in the Discovery Studio software [25] used for all the steps above. The enzyme was cut into a distance of 10 Ǻ around the ligand to reduce computational requirements, and hydrogen atoms were added to the structure in order to maintain the geometric integrity of the receptor.

2.3. Molecular Dynamics Simulation

The optimized complexes were subjected to the molecular dynamics simulation (MDS) process with the aim of generating a conformational ensemble profile (CEP) for each complex and thereby investigating the conformational flexibility [26, 27]. The temperature for dynamics was adjusted to 300 K in order to stay near the temperature used in biological assay; the simulation sampling time was 50 ps with 0.001 ps intervals. A distance dependent dielectric function was also applied in order to try to model the explicit solvent effect. In addition, all corresponding Cα atoms were fixed to prevent a large conformational change of the ligand, which could result in sterically prohibited conformations.

2.4. Alignment and Interaction Pharmacophore Elements Definition

Ten alignments were performed using the ordinate belonging to amino acid residues in the protein catalytic site and the ligand: Ala103, Lys105, and S25; S25, N21, and Ala103; Ala103, Met153, and S25; (4) Ala103, S25, and Lys105; (5) S25, Ala103, and Lys105; (6) Lys105, Ala103, and S25; (7) S25, Lys105, and Ala103; (8) Lys105, S25, and Ala103; (9) S25, Ala103, and N21; (10) Ala103, S25, and Met153 (Figure 2). Each conformation from the CEP of each complex was placed in a reference cubic cell space according to the trial alignment under consideration.

The atoms of each compound were classified into six types of interaction pharmacophore element (IPE) corresponding to the kinds of atoms that can occupy each cell box in accordance with the 4D-QSAR methodology [27]. IPEs are classified into the following: nonpolar (np); positively charged polar (p+); negatively charged polar (p−); hydrogen bond acceptors (ha); hydrogen bond donors (hd); and aromatic systems (ar). The grid cell occupancy profiles for each of the chosen IPEs were computed and used as the basis set of trial cell occupation descriptors (GCOD, Grid Cell Occupancy Descriptor) that will be used in the construction of QSAR models. The size of the grid cell was set to 2 Å which corresponds to the whole number nearest to twice the van der Waals radius of a hydrogen atom ( Å).

2.5. Obtaining the 4D-QSAR Models

In order to avoid noise or useless data, databases named DB1, DB2, and DB3 were generated. DB1 databases were built excluding variables where GCODs were equal to zero for all molecules. DB2 and DB3 databases were built excluding variables where variance values were less than 0.1 and 1.0, respectively. The GCODs were used to form the trial basis set for the Genetic Function Approximation (GFA) analysis [28] using Wolf 6.2 software [29].

2.6. Validation of the Models

QSAR modeling produces predictive models derived from application of statistical tools correlating biological activity with descriptors representative of molecular structure or properties. For validation of QSAR models, the following strategies were adopted:(1)Internal validation using leave-one-out cross-validation ( or ) [30].(2)Adjusted : models with greater than 0.5 are considered good predictive models [31].(3)External validation using the value.(4)Modified equation for giving importance to the difference between and (observed and predicted values with the zero axis intersection). Values greater than 0.5 can be a good indicator of a good external predictability [32]:(5): which are calculated based on the correlation of the observed (-axis) and predicted (-axis) response data with and without the intercept and also by interchanging the axes [33]. Change of the - and -axes gives the value of . Values lower than 0.2 can be a good indicator of a good external predictability [33](6)-randomization: which consists of the exchange of the values of the independent variables randomly. For an acceptable QSAR model value must be less than the correlation coefficient of the nonrandomized design.(7): which penalizes the model for the difference between squared mean correlation coefficient () of randomized models and squared correlation coefficient () of the nonrandomized model [32]

3. Results and Discussion

Data Set 1 was performed with Test Group I (molecules 1, 22, 30, 31, 35, 38, 39, 41, and 43) (Figure 6) and the statistical parameters are shown in Table 1. All the alignments have and value greater than 0.7 and 0.5, respectively, which are acceptable parameter values for a good QSAR model. Analyzing the values, the alignment Ali3_0.01_1 did not submit a minimum acceptable value for this parameter, 0.5; thus it was excluded. Another parameter used in the validation of the models was and according to the results obtained in this validation, only the alignment Ali1_0.01_1 features a value ≥0.5, and therefore the other alignments were not used.

The Data Set 2 was performed with the Test Group II (3, 5, 6, 14, 17, 18, 28, 44, and 51) (Figure 6) and the statistical parameters are shown in Table 2.

All the alignments have and value greater than 0.7 and 0.5, respectively; however the parameter value was less than 0.5 for Ali3_0.1_2, Ali4_0.1_2, Ali5_0.01_2, Ali5_0.1_2, Ali6_0.1_2, and Ali8_0.1_2. Observing values, it was possible to exclude the alignments Ali2_0.01_2, Ali2_0.1_2, Ali3_0.01_2, Ali4_0.01_2, Ali6_0.01_2, Ali7_0.01_2, Ali7_0.1_2, Ali8_0.01_2, Ali9_0.01_2, Ali10_0.01_2, and Ali10_0.1_2. The value is dependent on the average of the training group; to avoid overestimation, value calculations were performed, which is an additional validation for the test group. In this validation, only the alignments Ali1_0.1_2 and Ali9_0.1_2 showed satisfactory results (≥0.5) and therefore only these two alignments were considered.

Data Set 3 was performed with Test Group III (9, 12, 15, 21, 32, 37, 48, 50, and 51) (Figure 6) and the statistical parameters are shown in Table 3. and values are within the permitted, but values ≥0.5 were not found for alignments Ali3_0.01_3, Ali4_0.1_3, Ali5_0.1_3, and Ali8_0.1_3, which caused their exclusion. The values of the parameter show that only the alignments Ali1_0.1_3, Ali5_0.01_3, and Ali10_0.1_3 remained ≥0.5, which is the minimum acceptable for validation. Only alignments Ali5_0.01_3 and Ali10_0.1_3 showed acceptable and values.

After preliminary analysis of the validations for all data sets, only alignments Ali1_0.01_1, Ali1_0.1_2, Ali5_0.01_3, Ali9_0.1_2, and Ali10_0.1_3 remained to build the model.

Analyzing parameters, it is observed that all the alignments were satisfactory because all were lower than the respective . As for the parameter, only the values for alignments Ali5_0.01_3 and Ali9_0.1_2 are within the ideal range ≥0.5. Comparing the number of outlier compounds among the alignments it was found that Ali9_0.1_2 presented three compounds and Ali5_0.01_3 did not show any outlier compound. For this reason Ali5_0.01_3 was chosen for further analysis.

3.1. Model Ali5_0.01_3 Analysis

The QSAR model for the alignment Ali5_0.01_3 was defined by where , , , , , , , , , , , and .

Equation (4) was used for calculating the pIC50 values of the set data and the results are plotted in Figure 3. Leave-one-out cross-validation analysis of the model showed a value of 0.672 with standard error of 0.447. This means that the model has a predictive capacity of 67%. LOO-CV correlation coefficient values over 0.5 reveal that the model is a useful tool for predicting affinities for new compounds in this set. The optimum number of latent variables (PLS components) used for further analysis was seven.

In order to better understand the behavior of the adjusted data for the model, the cross-correlation matrix between different GCODs in the model was calculated [34]. There is no correlation ( > 0.7) between the GCODs pairs.

Figure 4 shows William’s plot [35] for the model Ali5_0.01_3. As it can be observed, both the training and test set compounds lie in the applicability domain of the model, demonstrating the reliability of the model.

The best model of alignment Ali5_0.01_3 generated 7 descriptors, five GCODs with positive coefficients (1) , (2) , (3) , (4) , and (5) , and corresponded to favorable interactions between the compounds substituents and the amino acid residues on the enzyme catalytic site. Thus, substituents with occupancy frequency of these IPE types inside the grid cells increase the activity. The GCODs (6) and (7) had negative coefficients corresponding to unfavorable interactions between the substituents of the compounds and ROCK amino acid residues. The numbers in parentheses indicate the cartesian coordinates , , and of the cell box and the letters in brackets indicate the type of IPE of the atoms occupying the cell.

A graphic representation of the 3D pharmacophore embedded in 4D-QSAR model is shown in Figure 5 using compound 18 as reference. GCODs were labeled as () which means the cartesian coordinate positions of the selected grid cell () and the respective interaction types (IPE). GCOD () shows aromatic IPE type influencing the increased activity of compounds. This descriptor is located close to Phe87 and Lys105 favoring interactions. GCOD shows nonpolar IPE type and is situated between the hydrogen atom of the phenyl ring and the amino acid residue Val90 favoring van der Waals interactions. GCOD is related to nonpolar IPE and is located close to amino acid residue Gly85. During the MDS only compounds 24, 30, and 38 had the hydrogen atom of the phenyl ring with high frequency of occupation in this descriptor. Therefore, the position of the GCOD was not helpful in this study. GCOD is related to the hydrogen bond acceptor IPE type and shows some frequency of occupancy for all compounds because it is positioned close to the canbonyl group (urealinkage). This suggests the importance of the intermolecular hydrogen bond with Lys105. This observation also was made by Pireddu et al. [22]. GCOD is related to nonpolar IPE and also shows some frequency of occupancy for all compounds because it is positioned close to the 4-(4-pyridinyl)-2-thiazolyl group, being in all of them. This region is close to Met156, Leu205, and Ala215 favoring van der Waals interactions. In addition it is possible to observe hydrogen bond interaction between the NH backbone of the Met156 and the nitrogen atom of the pyridyl ring.

The GCOD () is related to aromatic IPE and has a negative coefficient indicating that the occupation of the grid cell by the aromatic ring results in a decrease in potency of the compounds. This descriptor is situated between Arg84, Lys200, and Asp202 with highest frequency of occupancy for compounds 17, 25, and 27. This descriptor can explain the difference in the activity of compounds 17-18 and 26-27, depending on the chiral center. When the center is in position, the substituents occupied by GCOD () generate a steric hindrance. This limitation also occurs when one of the compound 25 phenyl rings occupies this position. The GCOD () is related to nonpolar IPE and presents the most negative coefficient causing great influence on the activity of the compounds. This descriptor showed higher occupation frequency for compounds 28, 34, 35, and 36, respectively, and it is located close to the substituent attached to the benzene ring. This region is at the opening of the entrance of the ROCK1 active site near the part where there will be solvent. This indicates that nonpolar substituents should be avoided in this position and polar substituents should be explored.

Analyzing the results obtained from the 4D-QSAR study and experimental data, new structures were proposed in order to increase the activity of these compounds. The bioactivities of the proposals were made through a virtual test of activity in which we applied (4) of Model 1 from Ali5_0.01_3. The predicted activity values (Figure 7) were obtained. The structures were constructed using molecule 18 as a base, which is the most active of the series.

P1, P2, and P3 proposed structures showed a predicted biological activity greater than the experimental activity of compound 18, which is the most active compound of the studied series. Despite having a high predicted activity value P3 and P4 did not exceed the value of the compound 18. The compounds of Figure 7 were submitted to evaluation using Lipinski rule of five [36]. It is a rule to evaluate if a chemical compound with a certain pharmacological or biological activity has properties that would make it a likely orally active drug in humans. The objective is to estimate the solubility and the permeability of drugs administered orally, predicting the influence of the chemical structure on the absorption of a compound. The Lipinski rule of five [36] states that most “drug-like” molecules have , molecular weight (MW) ≤ 500, number of hydrogen bond acceptors () ≤ 10, and number of hydrogen bond donors () ≤ 5 and polar surface area (TPSA) no greater than 140 Å2. Molecules violating more than one of these rules may have problems with bioavailability. The proposed molecules have been designed in the Molinspiration Online Property Calculation Software Toolkit (http://www.molinspiration.com/) to evaluate the criteria discussed above. The results (Table 4) are encouraging for the proposed compounds because all proposed molecules are under the conditions stipulated by Lipinski, indicating no absorption problems.

4. Conclusion

In this study a series of pyridylthiazole inhibitors for ROCK1 were evaluated through RD-4D-QSAR. The best models were obtained from the evaluation of the training set of 44 compounds and 3 test groups of 9 compounds with ten alignments for each combination. The best model was obtained from Ali5_0.01_3, in which the grid cell size employed was 2 Å and validation values , , and were obtained, which confirm the sturdiness of the model. Moreover, based on the evaluation of GCODs, 5 structures were proposed using the most active compound of the series, 3 of which (P1, P2, and P3) showed predicted biological activity values higher than the base compound. These compounds were evaluated by Lipinski’s rule and did not commit any violation, indicating that they are good candidates for synthesis and pharmacological evaluation.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors thank the Brazilian agencies “Conselho Nacional de Desenvolvimento Científico e Tecnológico” (CNPq, Brazil), “Coordenação de Aperfeiçoamento de Pessoal de Nível Superior” (CAPES), and “Fundação de Amparo à Pesquisa do Estado de Minas Gerais” (FAPEMIG, Brazil) for their support. They thank A. J. Hopfinger who kindly supplied the 4D-QSAR program for academic use.