Online NIR Analysis and Prediction Model for Synthesis Process of Ethyl 2-Chloropropionate
Online near-infrared spectroscopy was used as a process analysis technique in the synthesis of ethyl 2-chloropropionate for the first time. A partial least squares regression (PLSR) quantitative model of the product solution concentration was then established and optimized. The correlation coefficient of the PLSR calibration model was 0.9944, and the root mean square error of calibration (RMSEC) was 0.018105 mol/L. These values indicate that the quantitative calibration model performed well. Moreover, the root mean square error of prediction (RMSEP) of the validation set was 0.036429 mol/L. The results were very similar to those of offline gas chromatographic analysis, which supports the validity of the method.
Ethyl 2-chloropropionate (CAS number 535-13-7) is a clear colorless liquid with a pungent odor. Its flash point is 100°F, and it is denser than water and insoluble in it. In recent years, as an important chemical intermediate and industrial reagent, it has been widely applied in the synthesis of herbicides (e.g., phenoxypropionates, 2-(4-hydroxyphenoxy)propionate, and amino (or aryloxy) sulfonyl phenoxy propanates), plant auxiliaries (e.g., 2-chloroethyl trimethyl ammonium chloride, dimethylaminosuccinic acid), nonsteroidal antipyretic and anti-inflammatory drugs (e.g., naproxen, indomethacin, and ibuprofen), and so forth. Although the synthetic process of ethyl 2-chloropropionate is relatively simple, the current offline quantitative method for its production monitoring and quality control can hardly meet the requirements of researchers and producers.
In 2004, the Food and Drug Administration (FDA) issued a guidance document to the pharmaceutical industry regarding the implementation of process analytical technology (PAT). PAT has been defined as “a system for designing, analyzing, and controlling manufacturing through timely measurements of critical quality and performance attributes of raw and in-process materials and processes, with the goal of ensuring final product quality”. Recently, the application of near-infrared (NIR) spectroscopy has grown rapidly as an efficient online monitoring technique, and it has been used as an ideal tool for PAT. The growing interest in NIR is probably a direct result of its advantages over other techniques: outstanding sensitivity, high speed, low noise, nondestructive measurement, and the ability to analyze complex samples without the need for pure samples [3–5]. NIR spectroscopy was used as a process analytical technology to monitor the amino acid concentration profile during the hydrolysis process of Cornu Bubali by Wu et al. The use of near-infrared diffuse reflectance spectroscopy for the qualification of Ginkgo biloba extract as a raw material for pharmaceutical products was described by Rosa and coworkers. NIR spectroscopy has also been used as an analyzer to determine the effect of several operating conditions on recovery, selectivity, and productivity in the production of methyl isobutyl ketone (MIBK). The use of this PAT approach enabled the researchers to perform the necessary experiments in a time-efficient fashion and resulted in a 30% improvement in MIBK productivity.
Based on the above research status, the aim of this study was to use UV-NIR spectroscopy for online and nondestructive analysis of the synthesis process of ethyl 2-chloropropionate catalyzed by ion exchange resin for the first time. Model updating was used to make the models more accurate and to obtain better prediction results. The predicted concentration values were very close to those obtained by offline gas chromatographic analysis. The developed method is intended to provide a foundation for further process chemical analysis and a useful reference for similar online analytical research on synthetic reactions.
2.1. Chemicals and Materials
All reagents used were of analytical grade. Methanol, ethanol, 2-chloropropionic acid (purity: 99.56%), and ethyl 2-chloropropionate (purity: 99.13%) were obtained from Kelong Chemical Inc. (Chengdu, China). D001 strong acidic cation exchange resin was purchased from Shengquan Chemical Inc. (Langfang, China). The NIR spectrometer (NIRQUEST512), DH-2000-BAL deuterium light source, and optical fiber (T300-UV-VIS) were obtained from Ocean Optics Inc. (Dunedin, USA). Norprene chemical tubing was purchased from Saint Gobain Inc. (USA). A GC-7900 gas chromatographic system (Tianmei Scientific Instrumental Inc., Shanghai, China), equipped with a GH-300 hydrogen generator and a flame ionization detector, was used for the quantitative analysis.
2.2. Experimental Procedure and Equipment
The scheme of the experimental device is shown in Figure 1. It includes a condenser, peristaltic pump, NIR detector, workstation, optical source, detection cell, microsyringe connected to a Teflon tube, optical fiber, and signal line. In the three-necked flask, 2-chloropropionic acid was esterified with ethanol under the catalysis of the strong acidic cation exchange resin. Because the synthesis of ethyl 2-chloropropionate was performed at 110°C under reflux, the condenser was placed before the NIR spectrometer in order to eliminate the effects of temperature fluctuation on the spectral signals. The peristaltic pump was set, after repeated trials, to a stable flow rate of 0.2 mL/s to meet the requirements of online detection. In the self-made detection cell, the fiber-optic probe acquired the NIR signals, and the microsyringe collected samples at the sampling port. The sample solution was then analyzed by GC. The concentration of the product ethyl 2-chloropropionate was calculated from the initial concentration of the 2-chloropropionic acid ethanol solution, the peak areas of 2-chloropropionic acid and ethyl 2-chloropropionate in the collected samples, and the relative correction factor of 2-chloropropionic acid to ethyl 2-chloropropionate.
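The source equation did not survive in this text, so the following is only a hedged sketch of how such a mole-balance calculation could look. It assumes 1:1 stoichiometry, constant reaction volume, and that the molar ratio of acid to ester equals the correction factor times the peak-area ratio; the function name, argument names, and the numbers in the usage line are hypothetical and are not taken from the paper.

```python
def ester_concentration(c0, area_acid, area_ester, f):
    """Estimate the ester concentration (mol/L) from GC peak areas.

    Assumptions (NOT confirmed by the source):
      * one mole of acid gives one mole of ester, volume is constant;
      * n_acid / n_ester = f * area_acid / area_ester.
    """
    if area_acid < 0 or area_ester < 0:
        raise ValueError("peak areas must be non-negative")
    weighted_acid = f * area_acid          # acid area on the ester scale
    total = weighted_acid + area_ester     # proportional to the initial moles of acid
    if total == 0:
        raise ValueError("no acid or ester peak detected")
    return c0 * area_ester / total         # c0 times the molar conversion


# Hypothetical peak areas and initial concentration, for illustration only.
c = ester_concentration(c0=2.0, area_acid=1000.0, area_ester=1000.0, f=0.3826)
print(f"estimated ester concentration: {c:.4f} mol/L")
```

If the correction factor is defined in the opposite direction (ester relative to acid), the acid area would be divided by the factor instead of multiplied; the definition should be checked against the original equation before use.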
2.3. Gas Chromatographic Analysis
Samples of the reaction product were drawn from the pipeline of the PAT system at regular intervals. Each sample was diluted fivefold with anhydrous ethanol and then filtered through a 0.22 μm Millipore filter. A 5 μL aliquot of the sample solution was injected into the GC instrument and analyzed under the following conditions: nonpolar TM-1 capillary column (15 m × 0.53 mm × 0.5 μm); temperature program starting at 90°C, holding for 0.5 min, and then increasing to 165°C at a rate of 25°C/min; injection port and FID detector at 180°C; and N2 at 65 mL/min as carrier gas. Under these conditions, the ethanol solvent, the product ethyl 2-chloropropionate, and the substrate 2-chloropropionic acid reached baseline separation within 3 minutes. The GC chromatogram of ethanol, ethyl 2-chloropropionate, and 2-chloropropionic acid is shown in Figure 2.
2.4. NIR Spectroscopy Collection
The NIR spectroscopic data of the training set for the reaction process were collected continuously in 3 replicates every 30 s, and the average was taken as the spectral absorption data of the sample at that time. The data of the calibration set were the averages of three spectra collected once every 1 min in another batch of the synthetic process.
The spectra acquisition conditions were as follows: in transmission mode, the unreacted solution was used as the background reference, and the detection wavelength ranged from 900 nm to 1800 nm with a resolution of 4 nm; the integration time was 280 ms and the optical path was 2 cm. The stacked original NIR spectra of all the samples are shown in Figure 3.
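The replicate averaging described above can be sketched as follows. This is a minimal illustration, not the acquisition software used in the study; the function name and the three short example scans are hypothetical.

```python
def average_replicates(scans):
    """Average equally long replicate scans point by point."""
    if not scans:
        raise ValueError("need at least one scan")
    n_points = len(scans[0])
    if any(len(scan) != n_points for scan in scans):
        raise ValueError("all scans must have the same number of points")
    # zip(*scans) groups the absorbance values recorded at the same wavelength
    return [sum(values) / len(scans) for values in zip(*scans)]


# Three hypothetical replicate scans at one sampling time (absorbance units).
replicates = [
    [0.10, 0.20, 0.30],
    [0.12, 0.18, 0.33],
    [0.11, 0.22, 0.30],
]
mean_spectrum = average_replicates(replicates)
print(mean_spectrum)
```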
2.5. Data Analysis
Spectral data were processed by identifying the optimal spectral regions and selecting appropriate pretreatment methods, and they were then correlated with the data measured by the reference assays using PLSR to develop calibration models [10, 11]. The performance of the calibration models was assessed in terms of root mean square error (RMSE), correlation coefficient, root mean square error of cross-validation (RMSECV), and relative standard error of prediction (RSEP).
In the PLSR algorithm, including more PLSR factors in the model fits the modeling data better, but the prediction accuracy for other samples may become worse. This phenomenon is called “overfitting” of a model. In this case, the components that carry noise or nonspectral measurement information should be eliminated [13, 14]. The calibration models with the highest correlation coefficient and the lowest RMSEC and RMSEP, with the least difference between the two errors, were considered optimal.
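As an illustration of the figures of merit named above, the sketch below computes a root mean square error and a Pearson correlation coefficient between reference and predicted concentrations. It assumes the plain definition of RMSE with the number of samples in the denominator; some software applies a degrees-of-freedom correction for RMSEC, and the source does not say which convention was used. The example values are hypothetical.

```python
import math


def rmse(reference, predicted):
    """Root mean square error with n in the denominator (an assumed convention)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def correlation_coefficient(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


# Hypothetical reference (GC) and predicted (NIR) concentrations in mol/L.
gc_values = [0.20, 0.55, 0.90, 1.30]
nir_values = [0.22, 0.53, 0.93, 1.28]
print(rmse(gc_values, nir_values), correlation_coefficient(gc_values, nir_values))
```

The same `rmse` function gives RMSEC when applied to the samples used to build the model and RMSEP when applied to samples held out from model building.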
3. Results and Discussion
3.1. Results of Reference GC Assays with Internal Standard Method
Under the GC analytical conditions given in Section 2.3, the standard compound of 2-chloropropionic acid was used to determine the relative correction factor of ethyl 2-chloropropionate. The data of the internal standard method are listed in Table 1 and are linearly fitted in Figure 4. From the fit, the relative correction factor was determined as 0.3826, and a good linear relationship was obtained in the range of molar ratio from 0.0362 to 7.3567.
The above GC analytical method was also validated. Standard linearity was tested using linear regression, and both ethyl 2-chloropropionate and 2-chloropropionic acid showed excellent linearity, with correlation coefficients greater than 0.999 in the studied range. Within-run precision was measured as the RSD of six replicate standards of 2-chloropropionic acid and ethyl 2-chloropropionate. For 2-chloropropionic acid, the RSD values of retention time and peak area were within 0.16% and 1.56%, respectively. For ethyl 2-chloropropionate, the RSD values of retention time and peak area were within 0.37% and 1.98%, respectively. The accuracy of the GC method was validated by adding the two standards to samples of known concentration. The average recoveries of 3 replicates were between 96.81% and 101.25%. These results showed that the analytical method was acceptable.
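A correction factor of this kind is commonly read off as the slope of a straight line fitted to ratio data. The sketch below shows an ordinary least-squares line fit under the assumption that the peak-area ratio is plotted against the molar ratio and that the slope is the factor; the source equation is missing, so the choice of axes is an assumption, and the data points are synthetic values generated for illustration, not the entries of Table 1.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic, noise-free data spanning roughly the molar-ratio range reported above.
molar_ratio = [0.05, 0.5, 1.0, 2.0, 4.0, 7.0]
area_ratio = [0.3826 * m for m in molar_ratio]   # hypothetical response
slope, intercept = fit_line(molar_ratio, area_ratio)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```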
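The precision and accuracy figures above rest on two simple formulas, sketched here with the sample standard deviation for the RSD and the usual spiked-recovery definition. The function names and the example numbers are hypothetical; the source does not list the raw replicate values.

```python
import statistics


def rsd_percent(values):
    """Relative standard deviation in percent (sample standard deviation)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def spike_recovery_percent(found, original, added):
    """Recovery of a spiked standard: (found - original) / added, in percent."""
    if added <= 0:
        raise ValueError("the spiked amount must be positive")
    return 100.0 * (found - original) / added


# Hypothetical peak areas of six replicate injections.
areas = [1520.0, 1498.0, 1511.0, 1530.0, 1505.0, 1517.0]
print(f"RSD = {rsd_percent(areas):.2f}%")
# Hypothetical spike: 0.50 mol/L added to a 1.00 mol/L sample, 1.49 mol/L found.
print(f"recovery = {spike_recovery_percent(1.49, 1.00, 0.50):.2f}%")
```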
3.2. Model Development
3.2.1. Spectra Band Selection
A quantitative model was established by PLSR, in which all the wavelengths can be analyzed and processed. However, part of the signal in the spectrum (produced by the solvent, temperature, variation of flow rate, etc.) is not related to the target compound and degrades the performance of the model. A well-chosen wavelength range can improve the performance of the model while reducing the amount of computation. The comparison of different regions is shown in Table 2. According to the results, the interval of 1000–1240 nm was finally selected to establish the calibration model of ethyl 2-chloropropionate content.
The original spectra collected between 899.07 and 1264.06 nm during the reaction are shown in Figure 5. The absorbance in the two regions of 1003.79–1134.40 nm and 1212.14–1240.18 nm decreases with increasing reaction time: as the content of 2-chloropropionic acid falls, the intensity of the absorbance band related to the –COOH group decreases. However, there is a turning point at 1134.40 nm, and the absorbance in the region of 1134.40–1212.14 nm rises continuously with the increasing content of ethyl 2-chloropropionate. This region is closely related to the characteristic absorption of the –COOEt group.
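Restricting the model to the selected interval amounts to masking the spectrum by wavelength. A minimal sketch follows, with the 1000–1240 nm limits from the text as defaults; the function name and the five-point example spectrum are hypothetical.

```python
def select_band(wavelengths, absorbances, low_nm=1000.0, high_nm=1240.0):
    """Keep only the points whose wavelength lies in [low_nm, high_nm]."""
    if len(wavelengths) != len(absorbances):
        raise ValueError("wavelengths and absorbances must have equal length")
    kept = [(w, a) for w, a in zip(wavelengths, absorbances) if low_nm <= w <= high_nm]
    return [w for w, _ in kept], [a for _, a in kept]


# Hypothetical absorbances at a few of the wavelengths mentioned in the text.
wl = [899.07, 1003.79, 1134.40, 1240.18, 1264.06]
ab = [0.05, 0.21, 0.34, 0.18, 0.07]
band_wl, band_ab = select_band(wl, ab)
print(band_wl, band_ab)
```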
3.2.2. Pretreatment of Original NIR Spectra
NIR spectra of the samples, which contain much chemical information, need to be pretreated to ensure accurate analysis. Many factors cause interference in the spectral measurement process, so most NIR measurement methods require chemometric treatment. Taking measures to eliminate outside interference is therefore necessary and helps to optimize the performance of a quantitative model.
Table 3 compares the pretreatment methods and their combinations. The various pretreatment methods improve the performance of the models to different extents. In many cases, derivatives reduce peak overlap and eliminate linear baseline drift, although the noise level increases slightly. In addition, the standard normal variate (SNV) transformation was applied to reduce the effect of path-length changes and to correct for light scattering. Some pretreatments even gave negative values and were therefore avoided. The combination of SNV and the second-order derivative was superior to the other pretreatment methods, giving the greatest correlation coefficient and the smallest RMSE. Therefore, SNV + second-order derivative was selected to pretreat the original data.
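The selected pretreatment can be sketched as below. Two points are assumptions rather than facts from the source: SNV is applied before the derivative, and the derivative is a plain three-point finite difference, which stands in for whatever derivative filter (for example a Savitzky-Golay filter) the authors' software used. The example spectrum is hypothetical.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale each spectrum by its own mean and SD."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)
    if sd == 0:
        raise ValueError("a constant spectrum cannot be SNV-scaled")
    return [(value - mean) / sd for value in spectrum]


def second_derivative(spectrum):
    """Three-point second difference (per squared sampling step); drops both end points."""
    return [
        spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1]
        for i in range(1, len(spectrum) - 1)
    ]


def pretreat(spectrum):
    """SNV followed by the second difference (assumed order)."""
    return second_derivative(snv(spectrum))


# Hypothetical absorbance values at evenly spaced wavelengths.
raw = [0.10, 0.14, 0.22, 0.35, 0.31, 0.20]
print(pretreat(raw))
```

A pure linear baseline has a zero second difference, which is why the derivative step removes linear baseline drift, as noted in the text.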
3.2.3. Selection of Optimal Number of Principal Components
The data-compression decomposition used in PLSR differs from that of other chemometric decomposition models. It decomposes both the absorbance matrix and the matrix of reference concentrations, and a score matrix and a loading matrix are calculated for each (the two score matrices can be exchanged during the iterations). In this way, the reference concentrations are related more closely to the spectral information. In the PLSR process, cross-validation based on the samples from the training set and the calibration set was conducted, and the predicted residual error sum of squares (PRESS) was calculated each time. PLSR models with 0–12 factors were investigated, and the optimum number of factors was determined by PRESS. The distribution of PRESS is shown in Figure 6. The PRESS values decrease over the first seven components, which indicates that these components carry information related to the content of ethyl 2-chloropropionate. With fewer than 7 principal components the model was underfitted, and with more than 8 the PRESS value increased, indicating that the additional components were redundant. The optimal number of principal components was therefore determined to be 8.
For comparison, leave-one-out cross-validation, which leaves out one sample at a time, was also used to ascertain the number of significant factors (latent variables) in the PLS algorithm. The PRESS was calculated in the same manner each time a new factor was added to the PLS model, and the F-statistic was used to make the significance determination, as suggested by Haaland and Thomas. The maximum number of factors used to calculate the optimum PRESS was set to 13, and the optimum number of factors obtained in this way was 9.
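The PRESS criterion itself is simple to state in code. The sketch below computes PRESS from cross-validated predictions and picks the factor count with the smallest PRESS, preferring fewer factors on a tie. It does not implement the F-test of Haaland and Thomas, and the PRESS values in the example are hypothetical numbers shaped like the trend described in the text, not the data of Figure 6.

```python
def press(reference, cv_predicted):
    """Predicted residual error sum of squares for one factor count."""
    if len(reference) != len(cv_predicted):
        raise ValueError("inputs must have equal length")
    return sum((r - p) ** 2 for r, p in zip(reference, cv_predicted))


def best_factor_count(press_by_factors):
    """Return the factor count with the lowest PRESS (fewest factors on a tie)."""
    if not press_by_factors:
        raise ValueError("no PRESS values given")
    return min(press_by_factors, key=lambda k: (press_by_factors[k], k))


# Hypothetical PRESS values for models with 1 to 12 factors.
hypothetical_press = {
    1: 2.10, 2: 1.40, 3: 0.95, 4: 0.60, 5: 0.41, 6: 0.30,
    7: 0.22, 8: 0.19, 9: 0.21, 10: 0.24, 11: 0.27, 12: 0.31,
}
print(best_factor_count(hypothetical_press))
```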
3.2.4. Establishment of Regression Model by PLSR
Spectral data were processed by selecting the spectral region of 1000–1240 nm and SNV + second-order derivative as the pretreatment method, and they were then correlated with the data measured by the reference assays using PLSR to develop calibration models. With the optimal number of principal components, the correlation between the reference values and the NIR predictions of ethyl 2-chloropropionate during the reaction is shown in Figure 7. The correlation coefficients of the training set and the calibration set are 0.9944 and 0.988, respectively. The performance of the calibration models is assessed in terms of the root mean square error of prediction (RMSEP). The calibration model with a high correlation coefficient and a low RMSEP is considered the ideal model.
The predicted and reference values of ethyl 2-chloropropionate follow a similar trend throughout the process. However, uncertain factors such as the pulsation of the peristaltic pump, temperature, tiny particles, and external vibration all affect the NIR spectra to different extents, and they can lead to slight differences between prediction and reference. The results of online NIR analysis and the references are also compared in Figure 7. The RMSEC of the training set was 0.018105 mol/L, and the RMSEP was 0.036429 mol/L. This performance indicates that the prediction model is acceptable as a method of quantitative determination.
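For readers unfamiliar with PLSR, the following is a compact sketch of the textbook NIPALS algorithm for a single response (PLS1), written with the standard library only. The source does not state which software or PLS variant was used, so this is an illustration of the general technique, not the authors' implementation; the function names and the tiny noise-free data set are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_factors):
    """Fit PLS1 by NIPALS; returns the means and one (w, p, q) triple per factor."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]   # X residual
    f = [v - y_mean for v in y]                                 # y residual
    factors = []
    for _ in range(n_factors):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]  # weights ~ X'y
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:            # nothing left in y to explain
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]                         # scores
        tt = _dot(t, t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = _dot(f, t) / tt                                     # y loading
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, load, q))
    return x_mean, y_mean, factors


def pls1_predict(model, x):
    """Predict y for one new spectrum by repeating the deflation steps."""
    x_mean, y_mean, factors = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in factors:
        t = _dot(e, w)
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


# Hypothetical two-variable data with an exact linear relation y = 2*x1 + 3*x2 + 1.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y = [2 * a + 3 * b + 1 for a, b in X]
model = pls1_fit(X, y, n_factors=2)
print(pls1_predict(model, [6.0, 2.0]))
```

With as many factors as the rank of the centered data, PLS1 reproduces the ordinary least-squares fit; in practice fewer factors are kept, chosen by cross-validation, to avoid fitting noise.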
3.3. Verification of the PLSR Model
On the basis of the PLSR model developed above, 21 batches of synthetic experiments were used to verify the accuracy of the regression model. The product contents obtained from the NIR model and from offline GC analysis are listed and compared in Table 4. The difference between the two contents is small except for sample number 9, which could be the result of experimental error. After this outlier is excluded, the average relative deviation is 1.12%, and the average predicted recovery is 101.12%. The results in Table 4 show that the PLSR model predicts well enough for PAT analysis of ethyl 2-chloropropionate.
In this study, we investigated the synthetic process of ethyl 2-chloropropionate based on PLSR analysis and used NIR to achieve online and noninvasive monitoring of the reaction process. The sampling, spectrum acquisition, and PAT system were developed, and the established quantitative models can detect the dynamic change of target product content during the synthetic process accurately and in real time. The original spectra were pretreated through first-order derivative, second-order derivative, MSG, SNV, SG smoothing, and so forth. The correlation coefficient and RMSEC of the PLSR quantitative calibration model indicated that it had good performance. Offline gas chromatographic analysis was used to validate the predictive ability of the model, and the average relative deviation and recovery rate in the validation set were satisfactory. In brief, NIR spectroscopy has the potential to be extended to the whole reaction process as a time-saving and continuous measuring method, and online NIR analysis proved to be a fast and effective way to observe the extent of reaction. As a fast, simple, and nondestructive analysis technology, online NIR detection has become a helpful tool in various synthetic reactions and in industrial production.
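The two summary figures used in this comparison can be computed as sketched below, assuming that the relative deviation is taken as the absolute difference divided by the GC value and that the predicted recovery is the NIR value divided by the GC value. These definitions are assumptions, since the source does not spell them out, and the batch values are hypothetical, not the entries of Table 4.

```python
def relative_deviation_percent(nir, gc):
    """Absolute deviation of the NIR value from the GC value, in percent of GC."""
    return 100.0 * abs(nir - gc) / gc


def predicted_recovery_percent(nir, gc):
    """NIR prediction expressed as a percentage of the GC reference."""
    return 100.0 * nir / gc


def summarize(nir_values, gc_values, excluded=()):
    """Average both figures over all batches except the excluded sample numbers (1-based)."""
    pairs = [
        (n, g)
        for index, (n, g) in enumerate(zip(nir_values, gc_values), start=1)
        if index not in excluded
    ]
    deviations = [relative_deviation_percent(n, g) for n, g in pairs]
    recoveries = [predicted_recovery_percent(n, g) for n, g in pairs]
    return sum(deviations) / len(pairs), sum(recoveries) / len(pairs)


# Hypothetical contents (mol/L) for four batches; batch 3 plays the role of an outlier.
nir = [1.01, 0.99, 1.40, 1.02]
gc = [1.00, 1.00, 1.00, 1.00]
print(summarize(nir, gc, excluded={3}))
```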
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Preparation of this paper was supported by the National Scientific Foundation of China (no. 81373284) and 2013 Scientific Research Foundation of Sichuan University for Outstanding Young Scholars.
U.S. Department of Health and Human Services, Food, and Drug Administration, Guidance for Industry: PAT—A Framework for Innovative Pharmaceutical Development, Manufacturing and Quality Assurance, U.S. Department of Health and Human Services, FDA, 2004.
Y. Wu, Y. Jin, Y. Li, D. Sun, X. Liu, and Y. Chen, “NIR spectroscopy as a process analytical technology (PAT) tool for on-line and real-time monitoring of an extraction process,” Vibrational Spectroscopy, vol. 58, no. 1, pp. 109–118, 2012.
H. S. Xiong, X. C. Gong, and H. B. Qu, “Monitoring batch-to-batch reproducibility of liquid-liquid extraction process using in-line near-infrared spectroscopy combined with multivariate analysis,” Journal of Pharmaceutical and Biomedical Analysis, vol. 70, pp. 178–187, 2012.
W. F. McClure, “Near-infrared spectroscopy: the giant is running strong,” Analytical Chemistry, vol. 66, no. 1, pp. 43–53, 1994.
J. D. Kirsch and J. K. Drennen, “Near-Infrared Spectroscopy: applications in the analysis of tablets and solid pharmaceutical dosage forms,” Applied Spectroscopy Reviews, vol. 30, no. 3, pp. 139–174, 1995.
Z. S. Wu, Y. F. Peng, W. Chen et al., “NIR spectroscopy as a process analytical technology (PAT) tool for monitoring and understanding of a hydrolysis process,” Bioresource Technology, vol. 137, pp. 394–399, 2013.
S. S. Rosa, P. A. Barata, J. M. Martins, and J. C. Menezes, “Near-infrared reflectance spectroscopy as a process analytical technology tool in Ginkgo biloba extract qualification,” Journal of Pharmaceutical and Biomedical Analysis, vol. 47, no. 2, pp. 320–327, 2008.
N. M. Prinsloo, J. P. Engelbrecht, T. N. Mashapa, and M. J. Strauss, “Acetone to MIBK process optimization through multidisciplinary chemometrics and in-line NIR spectroscopy,” Applied Catalysis A: General, vol. 344, no. 1-2, pp. 20–29, 2008.
G. Lachenal, A. Pierre, and N. Poisson, “Application to the kinetic study of epoxy/triamine system (comparison with DSC and SEC results),” Micron, vol. 27, no. 5, pp. 329–334, 1996.
J. P. Collman, Z. Wang, C. Linde, L. Fu, L. Dang, and J. I. Brauman, “A quantitative UV–VIS probe of enantioselectivity in metalloporphyrin catalyzed oxygenation using aluminium model complexes,” Chemical Communications, vol. 18, pp. 1783–1784, 1999.
H. Martens and M. Martens, “Modified Jack-knife estimation of parameter uncertainty in bilinear modelling by partial least squares regression (PLSR),” Food Quality and Preference, vol. 11, no. 1-2, pp. 5–16, 2000.
H. R. Cederkvist and B.-H. Mevik, “Mean squared error of prediction (MSEP) estimates for principal component regression (PCR) and partial least squares regression (PLSR),” Journal of Chemometrics, vol. 18, no. 9, pp. 422–429, 2004.
J. Rodrigues, A. Alves, H. Pereira, D. D. S. Perez, G. Chantre, and M. Schwanninger, “NIR PLSR results obtained by calibration with noisy, low-precision reference values: are the results acceptable?” Holzforschung, vol. 60, no. 4, pp. 402–408, 2006.
N. Shetty and R. Gislum, “Quantification of fructan concentration in grasses using NIR spectroscopy and PLSR,” Field Crops Research, vol. 120, no. 1, pp. 31–37, 2011.
H. S. Guo, J. M. Chen, T. Pan, J. Wang, and G. Cao, “Vis-NIR wavelength selection for non-destructive discriminant analysis of breed screening of transgenic sugarcane,” Analytical Methods, vol. 6, no. 21, pp. 8810–8816, 2014.
W. Wu, Q. Guo, D. Jouan-Rimbaud, and D. L. Massart, “Using contrasts as data pretreatment method in pattern recognition of multivariate data,” Chemometrics and Intelligent Laboratory Systems, vol. 45, no. 1-2, pp. 39–53, 1999.
N. Dupuy, O. Galtier, Y. Le Dréau, C. Pinatel, J. Kister, and J. Artaud, “Chemometric analysis of combined NIR and MIR spectra to characterize French olives,” European Journal of Lipid Science and Technology, vol. 112, no. 4, pp. 463–475, 2010.
A. J. A. Santos, O. Anjos, R. Simões, J. Rodrigues, and H. Pereira, “Kappa number prediction of Acacia melanoxylon unbleached kraft pulps using NIR-PLSR models with a narrow interval of variation,” BioResources, vol. 9, no. 4, pp. 6735–6744, 2014.
H. Q. Yang, “Nondestructive prediction of optimal harvest time of cherry tomatoes using VIS-NIR spectroscopy and PLSR calibration,” Advanced Engineering Forum, vol. 1, pp. 92–96, 2011.
D. M. Haaland and E. V. Thomas, “Partial least-squares methods for spectral analyses. 1. Relation to other quantitative calibration methods and the extraction of qualitative information,” Analytical Chemistry, vol. 60, no. 11, pp. 1193–1202, 1988.
N. Wang, X. X. Zhang, Z. Yu, G. Li, and B. Zhou, “Quantitative analysis of adulterations in oat flour by FT-NIR spectroscopy, incomplete unbalanced randomized block design, and partial least squares,” Journal of Analytical Methods in Chemistry, vol. 2014, Article ID 393596, 5 pages, 2014.
V. A. McGlone, D. G. Fraser, R. B. Jordan, and R. Künnemeyer, “Internal quality assessment of mandarin fruit by vis/NIR spectroscopy,” Journal of Near Infrared Spectroscopy, vol. 11, no. 5, pp. 323–332, 2003.