Abstract

One hundred ten compounds of diverse structures (actives and excipients used in pharmaceutical preparations) were studied by RP-18 HPLC with acetonitrile-pH 7.4 phosphate buffer 1 : 1 (v/v) as the mobile phase. The relationships between the BBB permeation coefficients and the chromatographic parameters log k and (log k)/PSA were compared to those between the blood-brain barrier (BBB) permeation parameters and the RP-18 TLC descriptors Rf and Rf/PSA known from our earlier studies. It was found that the correlations between the BBB permeability and the HPLC data are slightly worse than those achieved for the thin-layer chromatographic data. MLR analysis based upon the physicochemical data confirmed the value of the molecular descriptors, related to the CNS bioavailability. These variables, combined with the HPLC data, made it possible to generate computational models, explaining 70–96% of the total variance of the CNS bioavailability. Contrary to TLC Rf, the advantage of the modification of HPLC log k with PSA (polar surface area) has not been confirmed and the results obtained with log k are superior to those obtained after a novel (log k)/PSA parameter has been introduced. Establishing a firm threshold limit of (log k)/PSA, log k, or even k and k/PSA to distinguish between the CNS+ and CNS− compounds was impossible. On the other hand, discriminant function analyses involving log k and (log k)/PSA as discriminating variables separated the CNS+ and CNS− compounds with the success rate ca. 90%. On the basis of these results, it was concluded that the RP-18 HPLC analytical models are entirely successful in studies and predictions of the BBB permeability.

1. Introduction

The blood-brain barrier (BBB) is a static anatomic barrier but also a dynamic barrier in which protein transporters of efflux and influx are active. The most important transporters are glycoprotein P (P-gp) and organic anions transporting polypeptides (OATP family) [1, 2]. The BBB maintains the brain homeostasis, reduces its penetration by endogenic compounds, and protects it against the access of xenobiotics [3]. Due to the existence of this barrier, the search for new, potential drugs, whose biological target is in the central nervous system (CNS), is a particularly difficult challenge [4]. The BBB penetration is also important because of the brain-related side effects of peripherally acting drugs [5]. Evaluation of the ability of a compound to cross the BBB (blood-brain barrier permeability (BBBp)) is therefore desirable at different stages of drug design and testing.

The compounds that reach the CNS are defined as BBB+ and the molecules of the limited CNS availability as BBB−. Classification of the BBB+/BBB− type may be based upon the values of log BB (with BB defined as the ratio of the drug concentration in the brain to that in blood in the state of equilibrium). Biological experiments capable of determination of drugs’ distribution between blood and brain are the best source of information. They are, however, tedious, costly, and not readily available as extensive screening tests of structure libraries [6]. Access to the BBBp availability data may be via in vivo experiments on rats involving measurements of drugs’ brain (B) and plasma (P) partitioning. The resulting parameter is log B/P [7].

The parameter based on kinetic studies of the BBB penetration in vivo, after intravenous administration, is usually expressed as log PS, where P (cm s−1) is the measure of the observed BBB permeability and S (cm2/g) is the surface area of vascular endothelium. PS in the physical meaning is the constant of one-way flux level (Kin), corrected with the brain flux value [8]. In vivo experiments related to kinetic phenomena are very difficult to compare to the thermodynamic partitioning, this being the reason why the log PS parameter is rarely encountered in studies of the BBBp prediction. The usability of the log BB parameter and its advantage over other BBBp descriptors lie in the ease of its estimation compared to the kinetic parameters of the BBB permeation [9]. However, conclusions on the basis of log BB should be drawn with some caution since this parameter as a sole predictor of the ability of a compound to penetrate the brain may lead to false results. Many CNS-active agents undergo P-gp-mediated transport that affects brain concentrations [10, 11]. The threshold value of log BB may be determined, e.g., by picking CNS-active (CNS+) and CNS-inactive (CNS−) drugs as an equivalent to the BBB+/BBB− classification [12]. Although this threshold has made it possible to evaluate ca. 1700 compounds with a high success rate, such classification is not entirely error-proof. The line drawn between CNS+ and CNS− type drugs is not always the same as between those that cross and those that do not cross the BBB. The CNS-inactive drugs may be either compounds that enter the brain but do not have proper biological targets there or substances that do not penetrate the BBB [12, 13]. It appears much more reliable to specify the BBB+ drugs in the CNS+ group than to identify the BBB− compounds on the basis of the lack of their CNS activity [12, 14].

Threshold values of the log BB parameter are different for many proposed models of the BBBp prediction. For example, it was established in [15] that the BBB+ compounds are those with the log BB value from 0.0 to −0.3 and the BBB− compounds have log BB lower than −0.3. According to other studies, drugs can be considered CNS+ if their log BB ≥ 0.7. The threshold value log BB = −1 has also been proposed. Both values were used for classification of over 400 and 2000 compounds, respectively [14, 16]. An opinion was also reported that log BB = −0.52, corresponding to 30% ratio of brain and plasma concentrations, may logically distinguish between BBB+ and BBB− [17]. The optimum threshold value is usually 0 to −1 [6, 10, 17–20]. In our earlier studies, we considered the compounds with log BB ≥ −0.9 to be BBB+ [21–23].
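Because log BB is the base-10 logarithm of a concentration ratio, the thresholds quoted above translate directly into brain/blood ratios. The following is a minimal sketch of that arithmetic; the function names and the sample concentrations are illustrative assumptions, and the default cut-off of −0.9 is the one used in the authors' earlier studies.

```python
import math

def log_bb(c_brain, c_blood):
    """Base-10 logarithm of the brain/blood concentration ratio at equilibrium."""
    return math.log10(c_brain / c_blood)

def is_bbb_plus(logbb, threshold=-0.9):
    """Classify a compound as BBB+ when log BB is at or above the threshold."""
    return logbb >= threshold

# log BB = -0.52 corresponds to a brain/blood ratio of about 0.30 (30%).
ratio_at_minus_052 = 10 ** -0.52

# Hypothetical concentrations in identical units (ratio 0.25), illustration only.
example = log_bb(c_brain=0.5, c_blood=2.0)
```

With these hypothetical values the same compound is BBB+ under the −0.9 threshold but BBB− under the −0.52 threshold, which illustrates how strongly the classification depends on the cut-off chosen.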

Over the last 20 years, several different methods to determine the CNS bioavailability have been proposed based upon log BB and factors that influence this value. The success rate of determinations using these models is usually 75–99% [14]. In studies of the BBB permeability, an in silico approach prevails. The existing methods of bioavailability determination include practical, experience-based rules (“rules of thumb”), classification methods, and approaches based on quantitative structure-activity relationships (QSAR) for selected cases [17].

The practical approach by the “rules of thumb” involves selecting the values of physicochemical parameters of solutes, influencing their ability to cross the BBB [13, 18, 24–26]. These studies have led to general rules and guidelines regarding simple physicochemical features of compounds that facilitate their BBB permeation. It has been established that the CNS-active compounds tend to be lipophilic and to have a limited number of H-bond donors, a low degree of ionization (especially anionic), and a low polar surface area compared to the total molecular surface [25].

Another group of analyses are classification models, based on a vast amount of information on the BBB+ and BBB− parameters for compounds paired with their molecular descriptors [12, 14, 18]. Published studies involve different calculation methods. However, this approach has a serious disadvantage since it is anticipated that the activity of compounds towards the biological targets located within the CNS is equivalent to their ability to cross the BBB. Nevertheless, the knowledge of the probability of crossing the BBB gained in these studies is very useful in other investigations [17].

Passive diffusion through the BBB is one of the most frequently investigated pharmacokinetic processes. One can say that a low passive diffusion level through the BBB (constant log PS) makes it possible to identify drugs incapable of entering the brain due to their low membrane permeation [8]. On the other hand, log BB describes the real access of a drug to the brain in the state of dynamic equilibrium. So, only both parameters taken together offer the true possibility of rational investigations of drug access to the brain [27].

Although many types of descriptors are used to model the BBB permeation, four are preferred and appear in many acceptable models. The key molecular descriptors are PSA (polar surface area), log P (octanol/water partitioning coefficient), MW (molecular weight), and HD and HA (the number of potential donors/acceptors of a hydrogen bond). Many studies point to a few basic features that a molecule must have to cross the BBB with ease: molecular weight (MW) lower than 500 Da; lipophilicity (log P) 2 to 3; nonionic form at physiological pH [28].
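The basic features listed above can be written as a simple screening check. This is only an illustrative sketch: the function name, the boolean ionization flag, and the inclusive treatment of the log P limits are assumptions, and such rules give a rough guideline rather than a prediction.

```python
def passes_rules_of_thumb(mw, log_p, ionized_at_ph_7_4):
    """Rough BBB screening: MW < 500 Da, log P between 2 and 3, nonionic at pH 7.4."""
    return mw < 500 and 2 <= log_p <= 3 and not ionized_at_ph_7_4

# Hypothetical descriptor values, for illustration only.
small_neutral = passes_rules_of_thumb(mw=300.0, log_p=2.5, ionized_at_ph_7_4=False)
large_molecule = passes_rules_of_thumb(mw=750.0, log_p=2.5, ionized_at_ph_7_4=False)
```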

In 1994, the so-called solvation parameters were introduced to the BBBp analysis [29–31]. This group comprises features such as polarizability (α), number of hydrogen bond donors (HD), number of hydrogen bond acceptors (HA), molecular volume (V), and molar refraction (MR) that have soon become the classic group of Abraham descriptors.

At some earlier stage, the same variables made it possible to successfully distinguish between benzodiazepine derivatives (211 compounds) belonging to 4 groups differing in their site of action. Biological targets within the CNS or outside of it were the basis for full discrimination of these groups [32]. The possibility of predicting the BBB bioavailability even for the structurally similar compounds helps to study and avoid some unwanted side effects. In the case of predictions based solely on physicochemical properties, it may be the rationale behind the elimination of selected structures from a synthesis plan.

The BBB permeability may be reflected by column and thin-layer chromatographic data (retention factors k, log k, Rf, and RM, retention time tk, and retention volume Vk) [33–41]. Our earlier experiments [21–23] proved a close relationship between the RP-18 thin-layer chromatographic data and the CNS bioavailability. We proposed computational models of the BBB permeability based on multiple regression analysis (MR) and discriminant function analysis (DFA), involving thin-layer chromatographic descriptors Rf and a novel parameter Rf/PSA, supported by a set of physicochemical data. It was confirmed that the generated models may be successfully applied to compounds of different structures and that increasing the number of cases did not diminish the predictive ability of the models; on the contrary, in some situations, it increased it. In total, a group of 154 structurally diverse molecules was investigated; this group included the set of 110 currently investigated structures.

The log BB (B2) descriptors and chromatographic data (Rf/PSA) exhibited a significant stability of the threshold limit between the CNS+ and CNS− compounds in all our experiments, involving a group of CNS+ drugs [21], CNS-inactive sunscreens and excipients present in pharmaceutical and cosmetic preparations [22], and a large group of compounds of diverse structures and site of action [23].

Having obtained such favorable results of the investigations involving the simple RP-18 TLC stationary phase, we wished to test the applicability of the RP-18 HPLC data obtained using the phase system as described earlier for the TLC models [21–23] in order to compare the applicability of the HPLC and TLC data to the BBB permeability modeling.

Our current HPLC analysis has involved 110 compounds used in pharmaceutical preparations (actives or excipients). It was our objective to select the model of the greatest predictive ability and, at the same time, as rapid, low-cost, and resistant to the structural diversity of analytes as possible. The number of cases in our current study has been reduced to 110 because of the limited availability of information on the CNS activity of some analytes (in our previous report [23], 43 cases were not defined as CNS±).

2. Experimental

2.1. Materials

One hundred ten drugs analyzed during these investigations (Figure 1, Supplementary Materials) were isolated from pharmaceutical preparations (1, 4, 5, 7–9, 11–20, 22, 23, 25, 26–45, 48–62, 65–90, 99, and 104–110), purchased from Sigma-Aldrich (2, 3, 6, 10, 21, 24, 46–47, and 93), or donated as free samples by CIBA (92), Polfa Pabianice (94–95 and 100–103), BASF (63, 97, and 98), and Merck (96). The purity of drugs isolated from pharmaceutical preparations was assessed by high-performance liquid chromatography (Section 2.2). All isolated drugs were used without further purification. Drugs purchased from Sigma-Aldrich were of analytical or pharmacopeial grade. Distilled water used for chromatography was from an in-house distillation apparatus. Analytical-grade acetonitrile and methanol were from Avantor Performance Materials (formerly Polskie Odczynniki Chemiczne). pH 7.4 phosphate-buffered saline was from Sigma-Aldrich. All of the analyzed compounds were investigated chromatographically and presented in previously defined populations [21–23]. The chromatographic data and physicochemical descriptors of these compounds were introduced into the analysis. The structure and activity of compounds used in our experiments are random. We have anticipated a large variability of these features among the studied compounds. This made it possible to search for a universal model regarding the studied cases. To achieve this, we have studied compounds of diverse structures from salicylic acid to clarithromycin and of different CNS activities from psychoactive drugs to excipients.

2.2. Chromatography

HPLC was performed with the Perkin-Elmer series 200 HPLC apparatus equipped with a UV detector (210 nm) and the LiChrospher 100 RP-18 (5 µm) column, with acetonitrile-pH 7.4 phosphate buffer 1 : 1 (v/v) as the mobile phase at a flow rate of 1 mL·min−1. All compounds were injected as 0.1 mg·mL−1 solutions in methanol (injection volume 1 µL). All chromatographic runs were performed in duplicate, and the mean k values were used in further investigations.
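The way duplicate runs feed into the log k descriptor can be sketched as follows. The standard definition of the retention factor, k = (tR − t0)/t0, is assumed here because the text does not state it; the dead time t0, the retention times, and the choice to take the logarithm of the mean k are likewise assumptions made for illustration.

```python
import math

def retention_factor(t_r, t_0):
    """Retention factor k from retention time t_r and column dead time t_0."""
    return (t_r - t_0) / t_0

def mean_k_and_log_k(retention_times, t_0):
    """Average k over replicate runs and return (mean k, log10 of mean k)."""
    ks = [retention_factor(t, t_0) for t in retention_times]
    k_mean = sum(ks) / len(ks)
    return k_mean, math.log10(k_mean)

# Hypothetical duplicate runs (minutes) with an assumed dead time of 1.0 min.
k_mean, log_k = mean_k_and_log_k([2.9, 3.1], t_0=1.0)
```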

2.3. QSAR Analysis

The physicochemical properties of compounds 1–110 were calculated earlier; we used these data in our previous studies involving the same cases. The CNS± variable was added for some new cases that have recently appeared in the DrugBank database [39]. The molecular descriptors listed in Table 1 (Supplementary Materials) were limited to the group selected in References [21–23], linked to the BBB permeation.

The molecular descriptors for the compounds investigated during this study were calculated with HyperChem 7.0 [40], utilizing the PM3 semiempirical method with Polak–Ribiere algorithm [41] (total dipole moment (DM (D)), van der Waals molar volume (V (V/100) (Å3)), energy of the highest occupied molecular orbital (eH (eV)), and energy of the lowest unoccupied molecular orbital (eL (eL × 10) (eV))). The distribution coefficient (log D), PSA (Å2), the number of H-bond donors (HD), and the number of H-bond acceptors (HA) were calculated using ACD/Labs 8.0 software [42]. The theoretical values describing BBB permeability were calculated as B2 (log BB = 0.547 − 0.016 PSA) [43]. The experimental BBB permeability (BBvivo) values and CNS+/CNS− binary BBB bioavailability scores were taken from the literature sources [19, 42]. The chromatographic data and molecular descriptors for compounds 1–110 are presented in Table 1 (Supplementary Materials).
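The computed descriptor B2 is a linear function of PSA, so it can be evaluated directly from the formula quoted above. The sketch below uses a hypothetical function name and hypothetical PSA values.

```python
def b2_from_psa(psa):
    """Theoretical log BB (B2) from polar surface area in square angstroms."""
    return 0.547 - 0.016 * psa

# Hypothetical PSA values, for illustration only.
b2_low_psa = b2_from_psa(50.0)    # moderately polar molecule
b2_high_psa = b2_from_psa(140.0)  # highly polar molecule
```

Since the slope is negative, a larger polar surface area always gives a lower predicted log BB.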

2.4. Statistical Analysis

One hundred ten compounds analyzed during these investigations were divided into two subsets: the training set (compounds with the known experimental BBB permeability (BBvivo), cases 1–40) and the test set (compounds without the known experimental BBB permeability (BBvivo), cases 41–110).

2.4.1. Stepwise Multiple Regression Analysis

The physicochemical parameters related to the compounds’ BBB permeability were previously determined [23] by the use of MR analysis. The stepwise multiple regression analysis and the correlation analysis were carried out using Statistica 13 [44]. The physicochemical data have only been used in the stepwise MR analysis in combination with the chromatographic data. The QSAR analysis based solely on molecular descriptors has not been repeated. The results were taken from Reference [23] and used in the further chemometric analysis. The values of the BBB permeability (BBvivo), determined for 40 cases, and the calculated values of log BB (B2) of 110 analyzed compounds were used as dependent variables and the other physicochemical molecular descriptors and chromatographic data as independent variables. Only physicochemical data, calculated for the known structures, are available for all 110 cases. Other parameters, such as CNS± bioavailability, are within this group limited to 101 cases, this being the reason for a different number of cases in different analyses.

The statistical significance (p level) of a result, as an estimated measure of the degree to which it represents the population, was determined as . The correlation matrix was used to correlate the biological activities with the different variables. If two independent variables showed a squared correlation R2 > 0.4, one of them was removed.
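The removal of intercorrelated independent variables can be sketched as a simple filter over pairwise squared correlations. The text does not say which variable of a correlated pair was removed; keeping the one encountered first is an assumption, and the descriptor values below are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_collinear(columns, r2_max=0.4):
    """Keep a variable only if its R2 with every already kept variable is <= r2_max."""
    kept = []
    for name in columns:
        if all(pearson_r(columns[name], columns[k]) ** 2 <= r2_max for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptor columns; HA is built to track HD closely.
data = {
    "HD": [0, 1, 2, 3, 4, 5],
    "HA": [1, 2, 3, 4, 5, 7],
    "DM": [2.0, 0.5, 3.1, 1.2, 2.6, 1.0],
}
selected = drop_collinear(data)
```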

Validation of the correlation models was carried out by the general internal cross-validation procedures: “leave-one-out” (LOO) and “leave-many-out” (LMO). In the LOO approach, one element is removed from the whole data set and used to verify the model generated with the remaining n − 1 elements; the procedure is then repeated with another element. In the LMO method, the data set is repeatedly divided into two subsets used for model generation and its verification, respectively. The predictive power of the developed models was evaluated using the following indicators: cross-validated squared correlation coefficient (Q2), predicted residual sum of squares (PRESS), standard deviation based on PRESS (SPRESS), and standard deviation of error of prediction (SDEP). The LMO cross-validation was applied by deleting 25% of the compounds in four cycles and predicting the BBB permeability of compounds deleted in each cycle from the corresponding equations derived from the reduced data set. Some criteria for the reliability prediction and robustness of the models are suggested in References [45–48]: and ; and .
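The LOO procedure and the indicators named above can be illustrated with a one-variable regression. This is a sketch under stated assumptions: the formulas used (Q2 = 1 − PRESS/SStot, SDEP = sqrt(PRESS/n), SPRESS = sqrt(PRESS/(n − k − 1)) with k predictors) are commonly used definitions that the text itself does not spell out, and the data are hypothetical. The LMO variant differs only in leaving out a block of cases per cycle instead of a single case.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def loo_statistics(x, y, n_predictors=1):
    """Leave-one-out PRESS, Q2, SDEP, and SPRESS for a one-variable model."""
    n = len(y)
    press = 0.0
    for i in range(n):
        # Refit without case i, then predict the omitted case.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return {
        "PRESS": press,
        "Q2": 1.0 - press / ss_tot,
        "SDEP": math.sqrt(press / n),
        "SPRESS": math.sqrt(press / (n - n_predictors - 1)),
    }

# Hypothetical predictor and response values, for illustration only.
stats = loo_statistics([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [0.1, 0.9, 2.1, 2.9, 4.2, 4.8])
```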

2.4.2. Discriminant Function Analysis (DFA)

Investigations of the CNS activity of the drugs analyzed throughout this study were based on the discriminant function analysis (DFA) using the physicochemical and chromatographic data connected with the BBB permeability and selected by MLR analysis. All results were compared with the models obtained and tested in the previous investigations [21, 22]. In this DFA, 101 structurally different compounds were assigned to the CNS+ or CNS− group of activity (defined according to Reference [39]; the remaining 9 compounds are not defined as CNS±). The classification functions were determined and validated for the compounds with the experimentally obtained BBB permeability (40 cases: 6 CNS− and 34 CNS+) and for compounds with calculated BBB permeability (B2) (101 cases: 26 CNS− and 75 CNS+).

Discriminant function analysis is a multivariate technique that has two purposes: to separate cases from distinct populations and to allocate new cases into previously defined populations [49]. The DFA was performed by STATISTICA 13.1 [44] software. In all subsequently performed analyses, the stepwise method was applied. The model was formed by introducing subsequent variables that mostly contributed to group discrimination. After introducing sufficient grouping variables to the model (i.e., after obtaining the maximum probability of a priori classification), discriminant functions (roots) discriminating the activity groups were calculated. The maximum number of functions will be equal to the number of groups minus one or the number of variables in the analysis, whichever is less. The quality of the discriminant function was evaluated by Wilks’ lambda parameter, which is a multivariate analysis of variance statistic that tests the equality of group means for the variable(s) in the discriminant function [49]. Wilks’ lambda can assume values in the range of 0 (perfect discrimination) to 1 (no discrimination), and the statistical significance of roots (discriminant functions) used for interpretation was established on the basis of χ2 tests of subsequent roots. Using statistically significant discriminant functions as the basis, canonical values were determined for the particular grouping variables. The scatter diagrams of the canonical values of the subsequent cases for the first two roots determined in the course of the analysis cannot be drawn to evaluate the discriminant power of the obtained models because there are only two discriminated groups and therefore only one root. The final phase of the qualitative analysis of the compounds was to determine the classification functions for each activity group. After calculation of the classification scores for a case, it is easy to decide how to classify it: in general, we assign a case to a group for which it has the highest classification score. The tool used to determine how well the classification functions predict the group membership of cases is a classification matrix. The classification matrix shows the number of cases that were correctly classified (on the diagonal of the matrix) and those that were incorrectly classified.
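The classification step and Wilks’ lambda can be illustrated for two groups and two discriminating variables. This is a minimal sketch of textbook linear discriminant analysis, not the STATISTICA stepwise procedure: each classification function is assumed to have the usual linear form (a constant plus coefficients obtained from the pooled within-group covariance matrix, with prior probabilities proportional to group size), Wilks’ lambda is computed as det(W)/det(T) from the within-group and total sums-of-squares matrices, and all data values are hypothetical.

```python
import math

def mean_vec(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

def sscp(rows, center):
    """2x2 matrix of sums of squares and cross-products about `center`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def fit_dfa(groups):
    """Return ({label: (constant, coefficients)}, Wilks' lambda)."""
    all_rows = [r for rows in groups.values() for r in rows]
    n_total = len(all_rows)
    means = {g: mean_vec(rows) for g, rows in groups.items()}
    within = [[0.0, 0.0], [0.0, 0.0]]
    for g, rows in groups.items():
        s = sscp(rows, means[g])
        within = [[within[i][j] + s[i][j] for j in range(2)] for i in range(2)]
    wilks = det2(within) / det2(sscp(all_rows, mean_vec(all_rows)))
    dof = n_total - len(groups)
    pooled_inv = inv2([[v / dof for v in row] for row in within])
    functions = {}
    for g, rows in groups.items():
        m = means[g]
        coef = [pooled_inv[i][0] * m[0] + pooled_inv[i][1] * m[1] for i in range(2)]
        const = -0.5 * (coef[0] * m[0] + coef[1] * m[1]) + math.log(len(rows) / n_total)
        functions[g] = (const, coef)
    return functions, wilks

def classify(functions, x):
    """Assign the case to the group with the highest classification score."""
    scores = {g: c + b[0] * x[0] + b[1] * x[1] for g, (c, b) in functions.items()}
    return max(scores, key=scores.get)

# Hypothetical (HD, log k) pairs; "CNS-" is written with an ASCII hyphen in code.
groups = {
    "CNS+": [[0.0, 0.5], [1.0, 0.8], [0.5, 0.2], [1.0, 0.4]],
    "CNS-": [[3.0, -0.5], [4.0, -0.2], [3.5, -0.9], [4.5, -0.4]],
}
functions, wilks_lambda = fit_dfa(groups)
```

A classification matrix follows by applying `classify` to every training case and counting agreements with the known group labels.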

The obtained discriminant models were evaluated by classification of 61 cases not included in the model (test set 41–72, 74–92, 96–98, 100–104, 108, and 110) with the known CNS± activity [42]. The values of the more important variables obtained with the DFA methodology were calculated for the test set. Then, these values were introduced into the discriminant functions (7) and (8) obtained in validated DFA. We classify the case as belonging to the group for which it has the highest classification score. The new compounds were assigned to the CNS+ or CNS− group of activity.

3. Results and Discussion

3.1. Stepwise Multiple Regression Analysis

In our previous investigations [21–23], it was proved that computational models can be generated to predict the bioavailability of solutes to the CNS. Our models of outstanding predictive values were based on the chromatographic data obtained by TLC on RP-18 stationary phase with acetonitrile-pH 7.4 phosphate buffer 70 : 30 (v/v) as the mobile phase. HPLC is one of the most useful analytical methods of studying actives and excipients present in pharmaceutical preparations. RP-18 chromatographic columns are very popular in these analyses because of their applicability across a broad range of a structural diversity of compounds and resistance to different components of mobile phases. The comparison of the efficiency of the CNS bioavailability models involving the TLC and HPLC data was based upon the results of experiments with matching stationary/mobile phases.

At first, the correlations of the BBB permeability parameters with the chromatographic data log k and (log k)/PSA were studied for the cases with BBvivo determined experimentally. It was possible to use both independent variables log k and (log k)/PSA in a single equation due to their acceptable correlation level 0.64. The resulting model for BBvivo, with parameters R = 0.41, R2 = 0.17, and n = 40, is not statistically significant. The same parameters R = 0.41, R2 = 0.17, and n = 40 were obtained for a model with one independent variable (log k). This model is statistically significant. The correlation between BBvivo and the HPLC data is at a very similar level to that for the RP-18 TLC experiment involving Rf and Rf/PSA parameters (R = 0.38; R2 = 0.15; n = 46) [23].

The correlation of the same chromatographic data with the computed permeability parameter B2 used in our earlier investigations has also been considered. This part of our study involved only the cases with the established BBvivo value (1–40). The course of the analysis was the same as described above. Using both independent variables in a single regression equation for B2, we have generated a model of the following parameters R = 0.34, R2 = 0.11, and n = 40 that is not statistically significant. An equation including one independent variable (log k) had comparable parameters: R = 0.32; R2 = 0.10; n = 40. This model is statistically significant, but the outcome is, of course, far from satisfying. The correlation between B2 and the HPLC data is at a significantly lower level than that involving the TLC data, Rf and Rf/PSA parameters (R = 0.67; R2 = 0.44; n = 46) [23].

In our earlier research [22, 23], we successfully proposed the B2 descriptor as a readily available measure of the BBB bioavailability useful in the case of analytes without the known BBvivo value. B2 was therefore used as a dependent variable in the analysis of applicability of the TLC data for large groups of cases. The correlation of the HPLC data with the previously used, computed B2 permeability parameter was investigated for the whole group of compounds 1–110. The analysis was very similar. When both uncorrelated independent variables, log k and (log k)/PSA, were used in a single regression equation, we obtained a model of the following parameters: R = 0.43; R2 = 0.19; n = 110, with the statistically insignificant variable (log k)/PSA. The equation of the comparable parameters (R = 0.43, R2 = 0.18, and n = 110) was generated for a model with one independent variable (log k). This model is statistically significant, and the result is better than that for the group of compounds 1–40. Comparison with the correlation between B2 and the variables Rf and Rf/PSA has once again proved the advantage of the TLC model [23]. Both independent variables were statistically significant, and the model explains over 30% of the total B2 variance in the group (R = 0.55; R2 = 0.30; n = 154) [23].

Studying the group of compounds 1–40 with the known BBvivo bioavailability and the extended group of cases 1–110, we have found worse fitting of the HPLC data to the B2 parameter than that of the TLC data [23]. It is noteworthy that the visible reduction of the number of cases (13%) does not change the level of the BBvivo/B2 correlation which is R = 0.54 for n = 40 and R = 0.55 for n = 46 [23].

The compounds 1–110 studied currently have been analyzed at an earlier stage of our investigations (TLC analysis involving 154 cases, Reference [23]). MLR analysis reported in Reference [23] was based on the physicochemical parameters of studied compounds and their correlation with BBvivo and B2 descriptors. The computational models generated then were highly satisfying and validated with QLOO and QLMO. We have therefore considered it unnecessary to process the MLR analysis of all physicochemical data collected earlier [23] for the currently studied group. On the other hand, since the physicochemical data selected via MLR analysis are then used in discriminant function analysis (DFA), confirmation of the quality of selected descriptors is justification of their further use. The models presented below give the relationships between the dependent variables BBvivo and B2, established for compounds 1–40 and 1–110, using the basic parameters of the appropriate computational models confirmed in our previous experiment [23] (for the list of statistics for the particular models, see Table 2, Supplementary Materials).

In the group of 46 compounds with the established BBB bioavailability, the significance of the following parameters was presented in Reference [23]: total dipole moment (DM (D)), the number of H-bond acceptors (HA), energy of the highest occupied molecular orbital (eH (eV)), distribution coefficient (log D), and van der Waals molecular volume (V (Å3)), correlated with the dependent variable BBvivo (R = 0.77; R2 = 0.59; n = 46). The same independent variables introduced into the analysis of compounds 1–40 gave the following result:

The strongest correlations with the calculated measure of the BBB permeability B2 were for DM, HA, and HD. The parameters of the equation proposed earlier [23] were very promising. The current results for the group 1–40 confirm the value of the same descriptors:

The same variables (HD, HA, and DM) were efficient in the group of 154 compounds [23] and 110 cases (1–110):

Even if the results change following the reduction of a group of studied cases, this comparison may be considered the confirmation of the significance of all molecular descriptors in studies of the BBB permeation.

Our further investigations concentrated on the possibility of using RP-18 HPLC to partially mimic the physiological conditions of crossing the BBB. It was assumed that the results of biochromatographic experiments should improve the predictive capabilities of purely computational models.

RP-18 HPLC data alone gave statistically significant correlations neither with BBvivo (R = 0.41, n = 40) nor with B2 (R = 0.34 for 40 cases or R = 0.43 for 110 cases, respectively). Molecular descriptors closely related to the compounds’ bioavailability should, however, contribute to a predictive value of the model.

In the next MR analysis, these physicochemical data were combined with HPLC chromatographic parameters (log k and (log k)/PSA). Investigations of the correlations of these independent variables with the BBvivo descriptors gave a significant increase in the correlation coefficient:

The model (Figures 1 and 2), established by the forward stepwise method, does not contain the variable (log k)/PSA. Model (4) explains almost 70% of the total BBvivo variance. The result is better than in the case of both groups of variables, chromatographic and physicochemical (15% and 62%, respectively), considered separately.

The improvement in the result of BBvivo variation studies after the chromatographic data have been introduced suggests the possibility of using these data in modeling of the variation of the computed BBB permeation descriptor B2. Multiple regression analysis was performed first for the group of compounds 1–40 and then 1–110.

In the case of the group 1–40, the outcome is also very good although stepwise analysis does not introduce the (log k)/PSA variable into the model. Equation (5) (Figures 3 and 4) explains 96% of the total variance and may have a predictive ability in estimating the BBB permeation:

Analysis of all the cases 1–110 studied by HPLC also gave a good and expected result. This time the model contained the chromatographic parameter modified with the PSA value. The model described by equation (6) and Figures 5 and 6 explains 90% of the total B2 variance and may have a predictive potential:

All the models presented above point to a leading role of the physicochemical parameters. Introduction of the chromatographic data to the stepwise MR analysis confirms their relationship with BBvivo and B2, increasing the correlation coefficient or improving other models’ parameters.

In all our previous experiments based on thin-layer chromatography, the parameter Rf/PSA played a visibly important role [21–23]. A sharp threshold limit of its value was found to separate compounds penetrating the brain (CNS+ ≥ 0.009) and those that do not cross the BBB (CNS− < 0.009). This proves a great significance of this modification for the TLC RP-18 chromatographic data because the Rf values studied for the same group of 110 cases varied across a very broad range (0.1–0.9 for CNS+ and 0.5–0.9 for CNS−) [23]. The confirmed stability of the Rf/PSA parameter in predictions of the BBB permeation turned our attention to the (log k)/PSA parameter in the HPLC model. However, the results reported above point to a better fit of the log k parameter that is present in each model involving the chromatographic data. The (log k)/PSA parameter is in the majority of cases either not introduced into a model in the stepwise regression analysis, or its presence does not improve the result. Establishing the theoretical threshold limit for (log k)/PSA, log k, or even k and k/PSA between CNS+ and CNS− compounds is impossible.
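The TLC threshold rule referred to above amounts to a single comparison. The sketch below uses a hypothetical function name and hypothetical Rf and PSA values; as stated in the text, no analogous cut-off could be established for the HPLC parameters.

```python
def classify_by_rf_psa(rf, psa, threshold=0.009):
    """TLC rule: CNS+ when Rf/PSA >= threshold, otherwise CNS- (ASCII hyphen in code)."""
    return "CNS+" if rf / psa >= threshold else "CNS-"

# The same hypothetical Rf paired with a small and a large polar surface area.
low_psa_case = classify_by_rf_psa(rf=0.6, psa=40.0)    # Rf/PSA = 0.015
high_psa_case = classify_by_rf_psa(rf=0.6, psa=120.0)  # Rf/PSA = 0.005
```

The example shows why the PSA correction matters for TLC data: identical Rf values fall on opposite sides of the threshold once divided by PSA.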

3.2. Discriminant Function Analysis

In discriminant function analysis (DFA), a qualitative parameter CNS± was used as a grouping variable. The compounds described in the literature and the database [19, 39] as crossing the BBB were defined as CNS+ (75 cases, code 1), and the compounds that do not cross the BBB or cross it with the probability <0.6 were defined as CNS− (26 cases, code 0). For the remaining 9 cases, the qualitative parameter CNS± has not been established.

At first, the stepwise DFA was performed for the 40 cases with known CNS bioavailability (compounds 1–40). The analysis was based on the physicochemical and chromatographic parameters standardized for this group that were selected by the MR analysis (equation (4)): HD, HA, log D, eH, V, DM, and log k. Additionally, HD and (log k)/PSA were introduced into the analysis. We achieved full discrimination of the CNS+ and CNS− cases in five steps. The classification is based on the following discriminating variables: HD, eH, V, DM, and log k. DFA performed by the stepwise method also introduces (log k)/PSA, which does not change the result of the discrimination. Summarizing the results of the discriminant function analysis, the most important discriminating variables are HD, DM, and V, but full (100%) discrimination of the group can be achieved only by introducing log k.

The classification functions for each group of CNS− and CNS+ activity were calculated:

The outcome of the last DFA was verified, demonstrating the high classification power of the model (Wilks’ lambda parameter = 0.337195; χ² = 38.59180; p level = 0.000001).
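The reported χ² value appears to be consistent with Bartlett's approximation for Wilks' lambda, χ² = −[n − 1 − (p + g)/2] ln Λ, if one assumes n = 40 cases, p = 5 discriminating variables (HD, eH, V, DM, and log k), and g = 2 groups. The text does not state which formula the statistical software applied, so the sketch below is an illustration under that assumption rather than a description of the authors' procedure.

```python
import math


def bartlett_chi2(wilks_lambda, n_cases, n_vars, n_groups):
    """Bartlett's chi-square approximation for Wilks' lambda.

    Returns (chi2, degrees_of_freedom), with df = p * (g - 1).
    """
    factor = n_cases - 1 - (n_vars + n_groups) / 2
    chi2 = -factor * math.log(wilks_lambda)
    df = n_vars * (n_groups - 1)
    return chi2, df


# Assumed inputs: 40 cases, 5 variables, 2 groups; lambda as reported in the text
chi2, df = bartlett_chi2(0.337195, n_cases=40, n_vars=5, n_groups=2)
print(f"chi2 = {chi2:.2f}, df = {df}")
```

Under these assumptions, the computed statistic agrees with the reported value of 38.59 to two decimal places.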

The reliability of the model derived from the DFA was determined by a cross-validation test based on the leave-one-out methodology. All cases with measured BBB permeability (BBvivo) (1–40) were examined. The procedure is described above. The results obtained with this methodology are presented in the cross-validation matrix (Table 1). They confirm the reliability of the DFA model (the cross-validation error is 0%). Examination of the cross-validation matrix showed that the classification probability was the same as the classification probability obtained a posteriori (100.00%). The reliability of the analyses was thus confirmed. The models presented above not only describe the investigated groups of cases precisely, but they also have a diagnostic value for new cases.
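The leave-one-out idea can be sketched in a few lines: each case is held out in turn, the classifier is refitted on the remaining cases, and the held-out case is then predicted. In the sketch below a simple nearest-centroid rule stands in for the discriminant functions, and the descriptor values are invented for illustration. Neither is taken from the study.

```python
from statistics import mean


def fit_centroids(rows, labels):
    """Return the per-class mean of every descriptor column."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, lab in zip(rows, labels) if lab == cls]
        centroids[cls] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda cls: sq_dist(centroids[cls]))


def leave_one_out_error(rows, labels):
    """Fraction of cases misclassified when each one is held out in turn."""
    errors = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        model = fit_centroids(train_rows, train_labels)
        if predict(model, rows[i]) != labels[i]:
            errors += 1
    return errors / len(rows)


# Invented standardized descriptors (two columns); 1 = CNS+, 0 = CNS-
rows = [[1.2, 0.8], [0.9, 1.1], [1.0, 0.7],
        [-1.1, -0.9], [-0.8, -1.2], [-1.0, -0.6]]
labels = [1, 1, 1, 0, 0, 0]
loo_error = leave_one_out_error(rows, labels)
```

A cross-validation error of 0, as reported for compounds 1–40, means that every held-out case was assigned to its known group by a model that had not seen it.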

In order to validate the methods and confirm the discriminating value of the descriptors selected in the course of this study, a group of 61 cases without measured BBvivo but with the BBB permeability known from the database [42] (40–72, 74–92, 96–98, 100–104, 108, and 110) was introduced. The predictive values of the determined classification functions (equations (7) and (8)) proposed for the classification of compounds with the functions CNS± were assessed. The results are additionally presented as the probability of assigning a case to the CNS− or CNS+ group. For this purpose, the variables calculated for the 61 compounds with the known BBB permeability, CNS− (20 cases) and CNS+ (41 cases), were added to the raw variable file for the compounds 1–40 and then standardized together.

The classification functions (7) and (8) were subsequently applied to calculate the appropriate qualification values for the compounds 41–44, 46, 48–55, 68, 89-90, 96–98, and 104 (20 CNS− cases) and 45, 47, 56–67, 69–72, 74–81, 83–88, 91-92, 100–103, 108, and 110 (41 CNS+ cases) (Table 4, Supplementary Materials).
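How linear classification functions yield a group assignment and a posterior probability can be sketched as follows. A case is assigned to the group with the highest classification score, and the posterior probabilities are proportional to the exponentials of the scores, provided the constants already include the prior terms. The coefficients and descriptor values below are hypothetical placeholders, not the coefficients of equations (7) and (8).

```python
import math


def classification_score(constant, weights, x):
    """Linear classification function: constant + sum(w_i * x_i)."""
    return constant + sum(w * v for w, v in zip(weights, x))


def posterior_probabilities(scores):
    """Convert classification scores into posterior probabilities."""
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {group: math.exp(s - top) for group, s in scores.items()}
    total = sum(exps.values())
    return {group: e / total for group, e in exps.items()}


# Hypothetical (constant, weights) for two standardized descriptors
functions = {
    "CNS+": (-1.0, [0.5, 1.5]),
    "CNS-": (-2.0, [-0.5, -1.0]),
}
x = [0.4, 0.8]  # hypothetical standardized descriptor values for one compound

scores = {g: classification_score(c, w, x) for g, (c, w) in functions.items()}
probs = posterior_probabilities(scores)
assigned = max(probs, key=probs.get)
```

Descriptors of new cases have to be standardized on the same scale as the training cases before the functions are applied, which is why the variables for all compounds were standardized together.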

On the basis of the DFA classification functions obtained for the 40 cases with established BBvivo values, 5 out of 20 CNS− compounds and 3 out of 41 CNS+ compounds were misclassified. The total number of correctly classified cases reached 87.89%.

As for the training set (1–40), the stepwise DFA was carried out for the set of all compounds with defined CNS± (101 cases, i.e., all of 1–110 except 73, 93–95, 99, 105–107, and 109), involving the independent variables used in the previous analyses (equations (9) and (10)). The classification matrix for the total of 101 cases is given in Table 2.

The classification functions for each group of CNS− and CNS+ activity were calculated:

The outcome of the last DFA was verified, demonstrating the high classification power of the model (Wilks’ lambda parameter = 0.429066; χ² = 81.65287; p level = 0.000001).

Only 5 cases in the group of 101 compounds were incorrectly classified.

Both of the chromatographic parameters are discriminating variables.

The predictive values of the determined classification functions (equations (9) and (10)) proposed for the classification of compounds with the functions CNS± were assessed. The results are additionally presented as the a posteriori probability of assigning a case to the CNS− or CNS+ group for the compounds 41–44, 46, 48–55, 68, 89-90, 96–98, and 104 (20 CNS− cases) and 45, 47, 56–67, 69–72, 74–81, 83–88, 91-92, 100–103, 108, and 110 (41 CNS+ cases) (Table 4, Supplementary Materials). Only 3 out of 20 CNS− compounds were misclassified. The total number of correctly classified cases reached 95.08%.
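The percentages of correct classification follow directly from the counts given in the text, as the short check below shows. The function name is ours, and the counts are the ones stated above: 3 misclassified among the 61 validation cases and 5 among all 101 cases.

```python
def percent_correct(n_cases, n_misclassified):
    """Percentage of correctly classified cases."""
    return 100.0 * (n_cases - n_misclassified) / n_cases


validation = percent_correct(61, 3)   # 61 validation cases, 3 misclassified
full_set = percent_correct(101, 5)    # all 101 cases, 5 misclassified
print(f"{validation:.2f}% {full_set:.2f}%")  # prints "95.08% 95.05%"
```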

Summarizing the results of the discriminant analysis, we have concluded that all the variables in the model are statistically significant.

4. Conclusions

Application of the RP-18 TLC analysis for the prediction of the CNS bioavailability of solutes has been a starting point of our RP-18 HPLC model. The mobile phase consisting of phosphate buffer with pH 7.4 and acetonitrile was used, and a novel strategy to predict the BBB permeation was developed using the RP-18 HPLC-derived chromatographic data [21–23].

The correlation of the BBB permeation coefficient and chromatographic parameters log k and (log k)/PSA has been studied for the cases with known BBvivo value. The relationship between BBvivo and the chromatographic data is at a similar level to that achieved in our RP-18 TLC experiments using Rf and Rf/PSA descriptors, explaining ca. 15% of the total BBvivo variance.

The correlation of the same chromatographic data with the calculated and previously used permeability coefficient B2 has also been studied [23]. Only the cases with the established BBvivo value (1–40) were considered. It was concluded that the relationship between B2 and the HPLC chromatographic data is at a significantly lower level than that achieved in our TLC experiment involving the Rf and Rf/PSA descriptors (R² = 0.44) [23] and explains only 10% of the total variance. In the case of the calculated B2 permeability coefficient, it is possible to study a model of the whole group of cases (1–110). This model explains ca. 20% of the B2 variance, but the comparison with the correlation of B2 vs. the Rf and Rf/PSA variables has once again pointed to the advantages of the TLC model, which explains over 30% of the total variance in the group of 154 cases. The level of the BBvivo/B2 correlation does not differ between the two experiments.

MLR analysis based upon the physicochemical data confirms the value of the molecular descriptors related to the CNS bioavailability. These variables, combined with the chromatographic data, make it possible to generate computational models explaining 70–96% of the total variance of the CNS bioavailability. All of our models of the BBB permeability point to a leading role of the physicochemical parameters. Introduction of the chromatographic data to the stepwise MR analysis confirms their relationship with BBvivo and B2, increasing the correlation coefficient or improving other model parameters. Contrary to the previous experiments [21–23], the significance of the chromatographic data modified with PSA has not been confirmed. Nevertheless, the introduction of the chromatographic parameter log k contributes to the model reliability and improves the statistics (equation (4) compared to equation (1)), so chromatographic experiments are definitely worth the effort, especially since we use the simple “single chromatographic run” methodology rather than chromatographic data obtained by extrapolation from a series of experiments. The log k parameter is present in every model involving the chromatographic data; the (log k)/PSA parameter is either not introduced into a model in the stepwise regression analysis or does not improve the result. Establishing a theoretical threshold limit of (log k)/PSA, log k, or even k and k/PSA for CNS+ and CNS− is impossible. Discriminant function analysis based on a group of cases with known BBvivo separates the CNS+ and CNS− compounds 100% correctly. Validation of equations (7) and (8) defining these groups with 61 cases of unknown BBvivo was successful: the predicted CNS± classification agreed with that known from literature sources in 88% of cases.

DFA for all the 101 cases with the known CNS± parameter resulted in 95% correct classification. On the basis of this result, we have concluded that the RP-18 HPLC analytical model is entirely successful in studies and predictions of the BBB permeability.

The final comparison of the RP-18 TLC and RP-18 HPLC analytical models points to a small predictive advantage of TLC. The stability of Rf/PSA with the repetitive threshold limit for compounds that cross the BBB (CNS+ ≥ 0.009) and those that do not enter the brain (CNS− < 0.009) [23] suggests a large significance of this modification of the RP-18 thin-layer chromatographic data. Unfortunately, it was impossible to establish such a theoretical threshold limit for the RP-18 HPLC data we have collected.

The selection of a better chromatographic model to predict the CNS bioavailability is facilitated by the comparison of timing, simplicity, and costs of both experiments.

Data Availability

The molecular descriptors and chromatographic data used to support the findings of this study are included within the supplementary information file.

Conflicts of Interest

The authors declare that there are no conflicts of interest related to the publication of this manuscript.

Acknowledgments

This work was supported by an internal grant of the Medical University of Lodz (no. 503/3-016-03/503-31-001).

Supplementary Materials

Supplementary 1. Figure 1: structures of compounds 1–110.

Supplementary 2. Table 1: calculated descriptors for compounds 1–110.

Supplementary 3. Table 2: statistics for Equations (1)–(6).

Supplementary 4. Table 4: calculated probability classification for compounds 41–44, 46, 48–55, 68, 89-90, 96–98, and 104 of CNS− activity and 45, 47, 56–67, 69–72, 74–81, 83–88, 91-92, 100–103, 108, and 110 of CNS+ activity (the probability of assigning a case to the CNS− and CNS+ groups).