Abstract

We developed quantitative structure-property relationship (QSPR) models to correlate the molecular structures of surfactant, cosurfactant, oil, and drug with the solubility of poorly water-soluble 2-aryl propionic acid nonsteroidal anti-inflammatory drugs (2-APA-NSAIDs) in self-emulsifying drug delivery systems (SEDDSs). The compositions were encoded with electronic, geometrical, topological, and quantum chemical descriptors. To obtain reliable predictions, we used multiple linear regression (MLR) and artificial neural network (ANN) methods for model development. The obtained equations were validated using a test set of 42 formulations and showed good predictive power, and linear models were found to be better than nonlinear ones. The obtained QSPR models should facilitate fast screening for optimal SEDDS formulations at the early stage of drug development and minimize experimental effort.

1. Introduction

The low water solubility of many drug candidates has been a major challenge for the pharmaceutical industry, since the oral delivery of these drugs may lead to low bioavailability and high intra- and intersubject variability [1]. Several formulation approaches to improve the solubility of these drugs have been investigated, including cyclodextrins [2], micelles [3], nanoparticles [4], solid dispersions [5], and self-emulsifying drug delivery systems (SEDDSs). SEDDSs are isotropic mixtures of an oil, surfactant, cosurfactant, and drug that form an O/W emulsion or microemulsion when introduced into aqueous phases under gentle agitation [6–8]. They can enhance the oral bioavailability of hydrophobic drugs, which makes them attractive carriers for poorly water-soluble drugs [8–11]. Dissolution in SEDDS and the absence of precipitation in the gastrointestinal tract are among the prerequisites for efficient intestinal absorption of drugs [12]. The drug solubility in SEDDS is therefore a key parameter for selecting optimal formulations [13].

Pharmaceutical preparation is a complicated procedure that includes preformulation studies, formulation screening, technology optimization, and stability studies. Among them, screening for the optimum formulation is a crucial step. Usually, the first stage is to select suitable excipients and preparation technology through preliminary experiments, and then to screen for the optimized formulation using single-factor design, orthogonal design, or uniform design. These experimental processes are expensive and time consuming. Therefore, estimating properties by theoretical modeling is an efficient approach to formulation screening. Quantitative structure-property relationship (QSPR) modeling is the process by which chemical structure is quantitatively correlated with a physical, chemical, or biological property. It has been widely used in pharmaceutical research [14–16], including the prediction of the biological activity [17], absorption [18, 19], distribution [20, 21], metabolism, excretion [22], and chemical reactivity-related toxicity [23] (ADMET) properties of drugs. However, QSPR is rarely applied in pharmaceutics [24–26], since numerous factors might affect the preparation process. It is therefore worthwhile to introduce QSPR into pharmaceutics, establishing the relationship between the properties of a formulation and the chemical structures of its components by mathematical methods, which will decrease the experimental time.

The aim of this study was to develop QSPR models for predicting drug solubility in SEDDS. We investigated a set of poorly water-soluble 2-aryl propionic acid nonsteroidal anti-inflammatory drugs (2-APA-NSAIDs). We then applied the models thus obtained to understand the solubility mechanism of drugs in SEDDS and to rapidly screen for optimized formulations.

2. Materials and Methods

2.1. Materials

Ketoprofen was provided by Southwest Pharmaceuticals Co., Ltd. (Chongqing, China). Flurbiprofen and loxoprofen were purchased from Wuhan Yuancheng Technology Development Co., Ltd. (Wuhan, China). Ibuprofen was a gift from Hubei Biocause Pharmaceutical Co., Ltd. (Hubei, China). Naproxen was obtained from Chengdu Jinhua Pharmaceutical Co., Ltd. (Chengdu, China). Carprofen was purchased from Shandong Fangxing Technology Development Co., Ltd. (Shandong, China). All other agents were of analytical grade.

2.2. Data Collection
2.2.1. Preparation of Self-Emulsifying Mixtures

SEDDSs consisted of surfactant, cosurfactant, oil, and drug. The surfactants employed were Tween20, Tween40, and Tween80. The oils and cosurfactants selected in the present study have definite, simple structures and are commonly used in pharmaceutics. Table 1 shows the composition of the formulations. The weight ratio of surfactant to cosurfactant (Km) was varied as 1 : 2, 1 : 1, 2 : 1, 3 : 1, and 4 : 1. The self-emulsifying mixtures, containing oil, surfactant, and cosurfactant, were prepared at specific ratios of oil to surfactant/cosurfactant mixture (Smix): 5 : 95, 10 : 90, and 15 : 85 (w/w). Each component was accurately weighed into the same screw-cap tube and mixed by gentle stirring and vortex-mixing. The model drugs were hydrophobic 2-aryl propionic acid NSAIDs including ketoprofen, ibuprofen, flurbiprofen, naproxen, loxoprofen, and carprofen. The structures of these drugs are shown in Figure 1.
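As a worked illustration of the ratios above, the following sketch converts a surfactant-to-cosurfactant ratio (Km) and an oil-to-Smix ratio into weight fractions of the drug-free mixture. The function name and the choice to express the result as weight fractions are assumptions made for illustration; they are not definitions taken from this study.

```python
def composition_fractions(km_surfactant, km_cosurfactant, oil_parts, smix_parts):
    """Return (oil, surfactant, cosurfactant) weight fractions of the drug-free mixture.

    Hypothetical helper: Km is given as km_surfactant : km_cosurfactant and the
    oil-to-Smix ratio as oil_parts : smix_parts, both by weight.
    """
    total = oil_parts + smix_parts
    oil = oil_parts / total
    smix = smix_parts / total
    km_total = km_surfactant + km_cosurfactant
    surfactant = smix * km_surfactant / km_total
    cosurfactant = smix * km_cosurfactant / km_total
    return oil, surfactant, cosurfactant


# Example: Km = 2 : 1 and oil : Smix = 10 : 90 (w/w)
oil, surf, cosurf = composition_fractions(2, 1, 10, 90)
print(round(oil, 3), round(surf, 3), round(cosurf, 3))  # 0.1 0.6 0.3
```

With five Km values and three oil-to-Smix ratios, a loop over both lists would enumerate the 15 ratio combinations for each oil/surfactant/cosurfactant system.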

2.2.2. Solubility Studies

For the solubility study, 0.1 g of self-emulsifying mixture was diluted with distilled water to 5 mL in a sealed tube and gently mixed with a vortex mixer (Ika, Germany). An excess amount of drug was added to the resulting microemulsion or emulsion. The blend was mixed, left to equilibrate at 37°C for 48 h in a water bath, and then centrifuged at 6,000 rpm for 10 min. The supernatant was filtered through a filter membrane (0.22 μm), diluted with methanol to a suitable concentration range, and quantified by HPLC (see Section 2.2.3).

2.2.3. HPLC Analysis of the Model Drugs

The HPLC analysis was performed with a Waters pump 515 and a UV-VIS detector 2487. The column was a Diamosil C18 100 mm × 4.6 mm column (Dikama, China). The mobile phase consisted of a mixture of methanol, water, and phosphoric acid (20 : 80 : 0.1, v/v/v). The UV detector wavelengths were set at 254 nm (ketoprofen), 222 nm (ibuprofen), 247 nm (flurbiprofen), 273 nm (naproxen), 222 nm (loxoprofen), and 300 nm (carprofen). The elution was carried out at a flow rate of 1.0 mL/min, and the temperature of the column oven (PH-730A, Phenomen, China) was set to 30°C. Each measurement was repeated three times.

2.3. Descriptor Generation and Variable Selection

Molecular descriptors are commonly used to represent the structural and physicochemical features of the components so that they can be used in a QSPR model. Thus, to establish a QSPR model, ab initio quantum mechanical calculations were first performed to obtain the relevant molecular descriptors using the Gaussian 03 software package (Gaussian 03, Gaussian, Inc., Pittsburgh, 2003). Geometry optimization was carried out and the quantum chemical and electrostatic parameters were calculated at the RHF/6-31G* level. Quantum chemical parameters including the dipole moment (Dipole), the energy of the highest occupied molecular orbital (EHOMO), and the energy of the lowest unoccupied molecular orbital (ELUMO), as well as electrostatic parameters including MaxQ, MaxQ+, ABSQ, and ABSQon, were obtained. In addition, the Discovery Studio 1.7 package (Accelrys Inc., USA) was used to calculate parameters such as molecular volume, polar surface area, Wiener index, logD, and logP. Constitutional parameters including surfactant ratio (SR), cosurfactant ratio (CoSR), and oil ratio (OR) were also calculated. Table 2 shows the values of important descriptors.

The nonionic surfactants Tween20, Tween40, and Tween80 belong to the polyoxyethylene sorbitan family. They have similar head structures, and the differences observed in their behavior are mainly due to their different hydrophobic portions [27]. Each surfactant structure was therefore cleaved into two parts, the common hydrophilic segment (HS) and a distinct lipophilic segment (LS), and the descriptors of the two segments were calculated separately. The cleavage was performed as described by Taha et al. [26].

The role of the cosurfactant in the formation of SEDDS is to increase the interfacial flexibility by extending into the surfactant interfacial monolayer and consequently creating void space among the surfactant molecules [13]. Both surfactant and cosurfactant in SEDDS are used to reduce the interfacial tension. For simplification, we therefore combined the descriptors of surfactant and cosurfactant. The overall descriptor was calculated as follows:

$$D_{\mathrm{S}} = \mathrm{SR} \times D_{\mathrm{LS}} + \mathrm{CoSR} \times D_{\mathrm{CoS}} \tag{1}$$

where $\mathrm{SR}$ is the ratio (w/w) of surfactant; $D_{\mathrm{LS}}$ is the molecular descriptor of the lipophilic segment of the surfactant; $\mathrm{CoSR}$ is the ratio (w/w) of cosurfactant; and $D_{\mathrm{CoS}}$ is the molecular descriptor of the cosurfactant.
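Assuming that the overall descriptor is a ratio-weighted sum of the descriptor of the surfactant lipophilic segment and that of the cosurfactant, it could be computed as in the sketch below. The function name, the argument names, and the numerical values are hypothetical and serve only to make the arithmetic concrete.

```python
def combined_descriptor(sr, d_ls, cosr, d_cos):
    """Ratio-weighted sum of two component descriptors (assumed functional form).

    sr, cosr : weight ratios (w/w) of surfactant and cosurfactant
    d_ls     : descriptor of the surfactant lipophilic segment
    d_cos    : the same descriptor evaluated for the cosurfactant
    """
    return sr * d_ls + cosr * d_cos


# Hypothetical molecular volumes for illustration only
s_volume = combined_descriptor(sr=0.6, d_ls=300.0, cosr=0.3, d_cos=60.0)
print(round(s_volume, 3))
```

Under this reading, a descriptor such as S-Volume changes with both the identity of the surfactant and cosurfactant and with Km, since Km fixes the two weights.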

The descriptors were selected to give a stable and interpretable model. A three-stage manual descriptor selection process was performed: (1) descriptors with too many zero values or with identical values (descriptors of the Tween HS) were eliminated; (2) descriptors with very small standard deviation values (<0.5%) were removed; (3) a single descriptor was chosen to represent each group of highly correlated variables (correlation coefficients >0.80), thereby minimizing redundancy and overlap among the descriptors. Since the ranges of descriptor values influence the quality of the models generated, we normalized the remaining descriptor values to a range of 0 to 1 [28].
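The three stages and the normalization described above can be sketched as follows. This is an automated approximation of a process that was carried out manually in this study: the zero-value threshold, the reading of "<0.5%" as a relative standard deviation, the rule of keeping the first descriptor of each correlated group, and all data values are assumptions made for illustration.

```python
from statistics import pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def select_descriptors(columns, max_zero_fraction=0.5, min_rel_sd=0.005, max_corr=0.80):
    """Return the names of the descriptors kept, in their original order."""
    kept = []
    for name, values in columns.items():
        zero_fraction = sum(v == 0 for v in values) / len(values)
        if zero_fraction > max_zero_fraction or len(set(values)) == 1:
            continue  # stage 1: too many zeros or identical values
        mean = sum(values) / len(values)
        if mean != 0 and pstdev(values) / abs(mean) < min_rel_sd:
            continue  # stage 2: very small (relative) standard deviation
        if any(abs(pearson(values, columns[k])) > max_corr for k in kept):
            continue  # stage 3: highly correlated with a descriptor already kept
        kept.append(name)
    return kept


def min_max_scale(values):
    """Rescale values linearly to the range 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical descriptor table: one list of values per descriptor
descriptors = {
    "volume": [100.0, 150.0, 200.0, 260.0],
    "constant": [1.0, 1.0, 1.0, 1.0],
    "nearly_flat": [10.00, 10.01, 10.00, 10.01],
    "volume_copy": [101.0, 149.0, 202.0, 259.0],
    "dipole": [2.0, 0.5, 3.1, 1.2],
}
kept = select_descriptors(descriptors)
scaled = {name: min_max_scale(descriptors[name]) for name in kept}
print(kept)  # ['volume', 'dipole']
```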

2.4. QSPR Modeling

To begin the model development process, the solubility data of drugs in formulas 1–6 were randomly split into a training set (80% of the total number of formulations) and an internal validation set (20% of the total number of formulations). The solubility data of drugs in formulas 7–8 were used as a predicting set. The descriptors selected in Section 2.3 were regressed against the solubility of the training set by means of multiple linear regression (MLR). The best equations were determined based on the highest squared multiple correlation coefficient ($R^2$), the highest Fisher ratio ($F$), and the lowest standard error ($s$).
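A minimal sketch of a random 80/20 split followed by an ordinary least-squares MLR fit is given below. It uses synthetic, noise-free data rather than the solubility data of this study, and the helper names, the random seed, and the use of the normal equations are illustrative assumptions rather than the software actually used.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_mlr(rows, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Evaluate the fitted linear model for one row of descriptor values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle the indices 0..n-1 and split them into training and validation parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]


# Synthetic data (not from this study): y = 1 + 2*x1 - 3*x2, descriptors in [0, 1]
rows = [(i / 9, ((7 * i) % 10) / 9) for i in range(10)]
y = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in rows]

train_idx, valid_idx = split_indices(len(rows))
coefs = fit_mlr([rows[i] for i in train_idx], [y[i] for i in train_idx])
valid_errors = [predict(coefs, rows[i]) - y[i] for i in valid_idx]
print([round(c, 6) for c in coefs])
```

Because the synthetic response is exactly linear, the fitted coefficients recover the generating values and the validation errors vanish up to rounding; real solubility data would leave residuals from which $R^2$, $F$, and $s$ are computed.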

The artificial neural network (ANN) is a suitable method for modeling nonlinear relationships [29]. It was also applied in an attempt to develop models with better predictive ability. All networks used in this study were of the three-layered back-propagation (BP) type. The input data were the descriptors selected in the linear models, and the output neuron represented the solubility values of drugs in SEDDS. Sigmoid transfer functions were used in all layers. The number of neurons in the hidden layer was adjusted to optimize the network, and the best model gave the highest correlation coefficient ($R$) and the lowest MSE. The internal validation set (18 formulations) was used to prevent overfitting.
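The structure of such a network can be sketched as below: an input layer, one sigmoid hidden layer, and a single sigmoid output neuron trained by back-propagation. For brevity the sketch uses plain per-sample gradient descent rather than the training algorithm used in this study, and the class name, learning rate, number of epochs, and synthetic data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyBPNetwork:
    """Input layer, one sigmoid hidden layer, one sigmoid output neuron."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight vector stores its bias as the last element.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(ws, x)) + ws[-1])
                  for ws in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out, hidden)) + self.w_out[-1])
        return hidden, out

    def train_step(self, x, target, lr):
        """One back-propagation update for the squared error of a single sample."""
        hidden, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        for j, h in enumerate(hidden):
            # Hidden-layer error uses the output weight before it is updated.
            delta_h = delta_out * self.w_out[j] * h * (1.0 - h)
            for i, v in enumerate(x):
                self.w_hidden[j][i] -= lr * delta_h * v
            self.w_hidden[j][-1] -= lr * delta_h
            self.w_out[j] -= lr * delta_out * h
        self.w_out[-1] -= lr * delta_out


def mean_squared_error(net, xs, ys):
    return sum((net.forward(x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Synthetic nonlinear data (not from this study); inputs and targets lie in [0, 1]
xs = [(i / 9, ((7 * i) % 10) / 9) for i in range(10)]
ys = [0.2 + 0.6 * a * b for a, b in xs]

net = TinyBPNetwork(n_inputs=2, n_hidden=10)
mse_before = mean_squared_error(net, xs, ys)
for _ in range(2000):
    for x, y in zip(xs, ys):
        net.train_step(x, y, lr=0.5)
mse_after = mean_squared_error(net, xs, ys)
print(mse_after < mse_before)
```

Because the output neuron is a sigmoid, the targets must be scaled to the interval (0, 1) before training. Evaluating `mean_squared_error` on a held-out validation set after each epoch, and stopping once it starts to rise, is one common way to use such a set against overfitting.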

2.5. Statistical Analysis

To evaluate the predictive ability of QSPR models, the statistical parameters of mean square error (MSE), root mean square error of prediction (RMSEP), the RMSE, the relative standard error of prediction (RSEP), and mean absolute error (MAE) [30] were used. Table 3 shows these equations.
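For orientation, commonly used definitions of these error measures are sketched below. They are assumed standard forms and may differ in detail from the equations listed in Table 3; in particular, RSEP is taken here as the root of the summed squared errors divided by the summed squared observed values, expressed in percent. RMSEP is the same quantity as RMSE evaluated on a prediction set. The numerical values are hypothetical.

```python
from math import sqrt


def prediction_errors(observed, predicted):
    """Return MSE, RMSE, MAE, and RSEP (%) for paired observed/predicted values."""
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    sum_sq = sum(r * r for r in residuals)
    mse = sum_sq / n
    rmse = sqrt(mse)
    mae = sum(abs(r) for r in residuals) / n
    rsep = 100.0 * sqrt(sum_sq / sum(o * o for o in observed))
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "RSEP": rsep}


# Hypothetical solubility values for illustration only
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
errors = prediction_errors(observed, predicted)
print({name: round(value, 4) for name, value in errors.items()})
```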

3. Results and Discussion

3.1. QSPR Models

Table 4 shows the solubility of 2-APA-NSAIDs in various formulations.

3.1.1. MLR

The best MLR models are given in (2). In all the equations, the variance inflation factor (VIF) was less than 10, suggesting the absence of multicollinearity. As shown in Table 5, the correlation matrix shows no high correlation between these descriptors, so they could be used to develop QSPR models. The statistical results indicate that these equations represent good models for calculating the solubility (Table 6).
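The VIF check can be illustrated with the sketch below, which uses the identity that the VIF of descriptor j equals the j-th diagonal element of the inverse of the descriptor correlation matrix. The helper names and the data are hypothetical; the third column is constructed to be nearly collinear with the first so that the threshold of 10 is exceeded.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long, non-constant columns."""
    n = len(columns[0])
    unit = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = sum(d * d for d in dev) ** 0.5
        unit.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in unit] for ci in unit]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical descriptor columns
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]   # uncorrelated with x1
x3 = [1.1, 1.9, 3.2, 3.9]     # nearly collinear with x1

vif_ok = variance_inflation_factors([x1, x2])
vif_bad = variance_inflation_factors([x1, x2, x3])
print([round(v, 2) for v in vif_ok])
print([v > 10 for v in vif_bad])
```

In the first case both values are 1, the minimum possible, whereas adding the nearly collinear column pushes the VIF of x1 and x3 well above 10 while leaving x2 close to 1.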

The models in (2) show the significance of the combination of SR, OR, O-MaxQ, O-ABSQ, O-ELUMO, S-Volume, and S-Dipole for the solubility of drugs in SEDDS. According to the $t$-test criterion, the most important descriptor is SR. Its positive coefficient suggests that a high surfactant concentration increases the solubility. Surfactant plays an important role in O/W microemulsion/emulsion formation: it forms a layer around emulsion droplets, which reduces the interfacial energy and provides a mechanical barrier to coalescence [31]. The result also suggests that the drugs are mainly dissolved in the surfactant phase.

The specific effects of O-MaxQ, O-ABSQ, and S-Dipole on the solubility depend on the drug type.

3.1.2. ANN

ANN models were constructed with the same descriptors as in the MLR models, using the Levenberg-Marquardt (LM) algorithm as the training function. The number of neurons in the hidden layer was set to 10 to ensure the lowest mean square error (MSE). Table 6 shows the statistical qualities of the ANN models compared with the MLR models. The $R^2$ values of the QSPR models indicate that they can explain more than 90% of the variation in the formulations, which corresponds to a significant explanatory capacity.

3.2. QSPR Models for Solubility Prediction

Table 7 shows the solubility predictions for the internal and external validation sets obtained from these models. As shown in Figures 2(a)–2(f), the plots of experimental versus predicted values obtained by MLR and ANN modeling indicate good correlations between the experimental and predicted values and confirm the satisfactory predictive ability of the QSPR models.

A statistical evaluation of both MLR and ANN models is shown in Table 8. According to the comparison between the two models in this study, except for the model drugs of ketoprofen and naproxen, MLR was found to be more reliable for the solubility prediction than ANN.

Based on the models, the optimal formulations in the internal validation set were as follows: oleic acid/Tween20/ethanol (Km = 4 : 1, 0.5 : 9.5) for ketoprofen; butyl oleate/Tween40/isopropyl alcohol (Km = 3 : 1, 1.5 : 8.5) for ibuprofen, flurbiprofen, and naproxen; oleic acid/Tween20/ethanol (Km = 2 : 1, 1.5 : 8.5) for loxoprofen; methyl oleate/Tween80/diethylene glycol monoethyl ether for carprofen. The best formulations for the predicting set were as follows: butyl oleate/Tween20/isopropyl alcohol (Km = 4 : 1, 1.5 : 8.5) for ketoprofen and ibuprofen; ethyl oleate/Tween40/ethanol (Km = 4 : 1, 1.5 : 8.5) for flurbiprofen, loxoprofen, and carprofen; butyl oleate/Tween20/isopropyl alcohol (Km = 4 : 1, 0.5 : 9.5) for naproxen. All the predicted optimum formulations were consistent with the experimental ones except for naproxen, indicating the value of the models in formulation screening.

3.3. The Drug Effect on the Solubility

To examine the influence of the drugs on the solubility, the descriptors of the drugs (X) were correlated with the drug solubility in different formulations (Y). The multiple linear regression analyses gave the equations in (3), which reveal a significant effect of the shape-related descriptor (Wiener index), the charge-related descriptors (MaxQ+, dipole moment), the quantum chemical parameter (EHOMO), and logD on the solubility of 2-APA-NSAIDs in SEDDS. According to the $t$-test criterion, the most important factors are Dipole, MaxQ+, Wiener index, and logD. The negative coefficient of the Wiener index shows that a drug of small size tends to have good solubility in SEDDS. The positive coefficient of logD indicates that an increase in lipophilicity favors the solubility.

4. Conclusions

In the present study, we used QSPR to predict the solubility of 2-APA-NSAIDs in self-emulsifying drug delivery systems by means of linear and nonlinear methods. We examined the effects of component ratio, steric effects, hydrophobic interactions, and electronic effects on the solubility by MLR and ANN. In all the models, the ratios of the components (SR, OR), the charge-related descriptors, and the quantum chemical parameter (EHOMO) appeared to be the most important factors. The models in (3) indicate the significance of the Wiener index, charge-related descriptors, and logD of the drugs for the solubility. The results of the MLR and ANN methods were satisfactory, and the nonlinear models were not found to be superior to the linear models. Since the predicted optimum formulations were consistent with the experimental ones, the QSPR models obtained should be useful to predict the solubility of 2-APA-NSAIDs in SEDDS, screen for the optimal formulation, and reduce experimental time.

Acknowledgment

This study was supported by the National Natural Science Foundation of the People’s Republic of China (Grant no. 30973659).