Abstract

Fourier transform infrared (FTIR) spectroscopy has been advocated as a promising alternative to the Karl Fischer titration method for quantification of moisture in oil. This study integrates partial least squares regression (PLSR) with FTIR spectra to predict moisture in locally accessible transformer oil and lubricating oil. Oil samples spiked with known moisture concentrations were extracted with acetonitrile and analysed with an FTIR spectrophotometer. The PLSR model was built on 100 training/test splits, and the prediction performance was measured with the percentage root mean square error (% RMSE). The concentration range studied was 0 to 5000 ppm. The marker regions of moisture were found at 3750–3400 and 1700–1600 cm−1, with the latter demonstrating better predictive ability in both lubricating oil and transformer oil. The prediction of moisture in lubricating oil was characterized by a lower % RMSE. At concentrations below 700 ppm, the prediction accuracy deteriorates, suggesting poor sensitivity. The PLSR model was applied to the IR spectra of a set of blind samples, verified with the Karl Fischer method (for transformer oil) and the Kittiwake method (for lubricating oil). The prediction was encouraging at concentrations above 1000 ppm; at lower concentrations, it was characterized by a high percent error. The algorithm, validated with 100 training/test splits, was converted into an executable program for prediction of moisture based on FTIR spectra. This program can be used to predict other substances provided that the marker region is identified. FTIR can be used for prediction of moisture in oil; nevertheless, its sensitivity and precision are low for samples with low moisture concentration.

1. Introduction

Moisture analysis is a routine monitoring activity for utility companies. The presence of moisture in transformer oil and lubricating oil can lead to breakdown of transformers and machinery. In transformer oil, moisture reduces the dielectric strength, whilst in lubricating oil it affects the oil viscosity and causes corrosion of the machinery. Conventionally, the moisture in oil is determined using the Karl Fischer titration method. The method demonstrates sensitivity as low as 10 ppm; however, it is expensive, involves various solvents, and is time consuming [1, 2].

Fourier transform infrared (FTIR) spectroscopy, integrated with partial least squares regression (PLSR), has been advocated as a promising alternative for quantification of moisture in oil. This technique has been employed for lubricating oil [3–7], transformer oil [8, 9], fuel oil [10], turbine oil [11], and biodiesel [12], with promising sensitivity as low as 50 ppm [6]. The advantage of FTIR is that it allows rapid analysis with minimal sample preparation and is inexpensive. It also permits on-site monitoring of utility oil; nevertheless, the quantification is oil-specific and requires individualised calibration.

In this study, we apply the PLSR approach to FTIR spectra for prediction of moisture in locally accessible transformer oil and lubricating oil. The model was validated on exhaustive training/test splits and applied to a set of blind samples, verified with the Karl Fischer and Kittiwake methods. The prediction algorithm was converted into an executable program that runs on Windows. The exhaustive training/test split strategy is new in this area of application; it tests and verifies the prediction while avoiding overfitting of a model to one single dataset.

2. Materials and Methods

2.1. Sample Preparation

A sample each of transformer oil (Hydrax Hypertrans HR) and lubricating oil (Shell Turbo T68), provided by Sarawak Energy Berhad, was used to develop the model. The calibration model was established from oil samples spiked with known concentrations of moisture and extracted with acetonitrile. Prior to sample preparation, the oil was left over dried molecular sieve (pore size 4 Å) for three days to remove the moisture present. The molecular sieve had been dried in a furnace at 325°C for 24 hours before use. Karl Fischer and Kittiwake measurements suggest that the saturation level of moisture was <10 ppm for transformer oil and <500 ppm for lubricating oil.

2.2. Sample Analysis

Ten millilitres of the treated oil was transferred to centrifuge tubes and spiked with distilled water to attain moisture concentrations of 0, 500, 700, 1000, 2000, 3000, 4000, and 5000 ppm. The standards were prepared by gravimetric addition of water to the oils, and the wide concentration range allows the sensitivity of the method to be examined. Each sample was vortexed and extracted with 10 mL of dried acetonitrile for 1 min, then centrifuged for 10 min at 7500 rpm to allow separation. The solvent layer was transferred to septum-capped vials for analysis with a Fourier transform infrared (FTIR) spectrophotometer equipped with an ATR accessory (Agilent 4500 Series FTIR). All spectra were obtained at a resolution of 4 cm−1 with 64 scans. For each oil (lubricating and transformer), the acetonitrile extracts of the spiked samples were scanned between 4700 and 590 cm−1 in five replicates, yielding a total of 40 spectra per oil. The spectra were saved in ∗.spc format.
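The spiking arithmetic can be sketched as follows. This is only an illustration: it assumes that ppm means mg of water per kg of oil (w/w) and that the added water is negligible relative to the oil mass. The function names and the oil density of 0.88 g/mL are hypothetical and are not taken from the study.

```python
# Illustrative sketch only; helper names and the oil density are assumptions.
LEVELS_PPM = [0, 500, 700, 1000, 2000, 3000, 4000, 5000]  # levels from the text
REPLICATES = 5                                             # replicates from the text


def water_to_add_mg(oil_mass_g, target_ppm):
    """Water mass (mg) giving target_ppm (mg/kg) in oil_mass_g grams of oil."""
    # grams * (mg / kg) / (1000 g / kg) -> mg
    return oil_mass_g * target_ppm / 1000.0


def spectra_per_oil(levels=LEVELS_PPM, replicates=REPLICATES):
    """Number of spectra recorded for one oil: levels times replicates."""
    return len(levels) * replicates


if __name__ == "__main__":
    oil_mass_g = 8.8  # assumed: 10 mL of oil at a hypothetical 0.88 g/mL
    for ppm in LEVELS_PPM:
        print(ppm, "ppm ->", round(water_to_add_mg(oil_mass_g, ppm), 2), "mg water")
    print("spectra per oil:", spectra_per_oil())
```

With eight concentration levels and five replicates, the count reproduces the 40 spectra per oil stated above.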

2.3. Model Development

The PLS regression model was developed in Matlab R2013a. The spectra in ∗.spc format were converted into ∗.mat files readable in Matlab. The IR region of interest was identified, and the spectral data in that region were extracted as a matrix, X (M × N). The corresponding concentrations form the response vector, y (M × 1). The spectral data were split into training and test sets: two-thirds of the samples were assigned to the training set to build the prediction model, whilst the remainder served as test samples to validate it. The training data, X (M × N), were standardized and the response, y (M × 1), was mean-centred before being subjected to the PLS algorithm. Two PLS components were used in this study. The test set, Xtest (Mtest × N), was likewise standardized using the mean and standard deviation of the training samples. The PLS algorithm assumes a linear relationship between the predictor, X, and the response, y. They are decomposed as X = T · P + E and y = T · q + f, where E and f are the noise terms, T is the scores matrix common to X and y, and P and q are the loadings. For brevity, the reader is referred to Brereton [13] for details of the PLS algorithm.
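A minimal sketch of the preprocessing and decomposition described above is given here in Python rather than Matlab. It assumes a single-response PLS (PLS1) fitted by the NIPALS procedure, which may differ in detail from the implementation used in this study. The function names (column_stats, standardize, pls1_fit, pls1_predict) are hypothetical.

```python
import math


def column_stats(X):
    """Column means and sample standard deviations of the training spectra."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    stds = []
    for j, col in enumerate(zip(*X)):
        var = sum((v - means[j]) ** 2 for v in col) / (n - 1)
        stds.append(math.sqrt(var) or 1.0)  # guard against constant columns
    return means, stds


def standardize(X, means, stds):
    """Scale spectra with the (training) means and standard deviations."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def pls1_fit(X, y, n_components=2):
    """Fit PLS1 by NIPALS: X = T.P + E and y = T.q + f."""
    means, stds = column_stats(X)
    y_mean = sum(y) / len(y)
    E = standardize(X, means, stds)   # standardized predictors
    f = [v - y_mean for v in y]       # mean-centred response
    n_vars = len(E[0])
    components = []
    for _ in range(n_components):
        w = [sum(row[j] * fi for row, fi in zip(E, f)) for j in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:              # nothing left to model
            break
        w = [v / norm for v in w]
        t = [sum(xj * wj for xj, wj in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        p = [sum(row[j] * ti for row, ti in zip(E, t)) / tt for j in range(n_vars)]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        # Deflate X and y before extracting the next component.
        E = [[xj - ti * pj for xj, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        components.append((w, p, q))
    return {"means": means, "stds": stds, "y_mean": y_mean, "components": components}


def pls1_predict(model, X):
    """Predict concentrations; test spectra are scaled with training statistics."""
    predictions = []
    for row in standardize(X, model["means"], model["stds"]):
        x = list(row)
        y_hat = model["y_mean"]
        for w, p, q in model["components"]:
            t = sum(xj * wj for xj, wj in zip(x, w))
            y_hat += q * t
            x = [xj - t * pj for xj, pj in zip(x, p)]
        predictions.append(y_hat)
    return predictions
```

Here pls1_fit would be called on the training spectra and concentrations, and pls1_predict on the test spectra. The means and standard deviations are stored in the model so that test samples never contribute to the scaling.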

2.4. Model Evaluation

The prediction model was validated based on 100 training/test sets with the percentage root mean square error (% RMSE) calculated as follows. The flow chart in Figure 1 illustrates the prediction of moisture in oil with PLSR. The model was programmed in Matlab R2013a, wrapped in a graphical user interface (GUI), and compiled into an executable program.
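The repeated-split evaluation can be sketched as below. The definition of % RMSE used here, the RMSE expressed as a percentage of the mean expected concentration, is an assumption, as other normalisations such as by the concentration range are also in use. The random, unstratified two-thirds/one-third split and all function names are likewise illustrative.

```python
import math
import random


def percent_rmse(y_true, y_pred):
    """RMSE as a percentage of the mean expected value (assumed definition)."""
    n = len(y_true)
    rmse = math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return 100.0 * rmse / (sum(y_true) / n)


def random_splits(n_samples, n_splits=100, train_fraction=2 / 3, seed=0):
    """Yield (train_idx, test_idx) index lists for repeated random splits."""
    rng = random.Random(seed)
    n_train = round(n_samples * train_fraction)
    for _ in range(n_splits):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield sorted(idx[:n_train]), sorted(idx[n_train:])


def average_percent_rmse(X, y, fit, predict, **split_kwargs):
    """Average test-set % RMSE; fit and predict are caller-supplied callables."""
    scores = []
    for train_idx, test_idx in random_splits(len(y), **split_kwargs):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_hat = predict(model, [X[i] for i in test_idx])
        scores.append(percent_rmse([y[i] for i in test_idx], y_hat))
    return sum(scores) / len(scores)
```

Any regression model can be plugged in through the fit and predict arguments. Averaging over many splits reduces the dependence of the reported error on one particular partition of the data.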

The algorithm was then applied to a set of blind samples, with the prediction compared against the measurements obtained using the Karl Fischer (for transformer oil) and Kittiwake (for lubricating oil) methods. The blind samples were an independent test set spiked with known concentrations of moisture. Analysis of variance (ANOVA) was performed on the % RMSE obtained from the different spectral regions over 100 training/test splits to determine whether there is a significant difference at the 95% confidence level.
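As an illustration of the comparison between spectral regions, the F statistic of a one-way ANOVA can be computed as sketched below. This assumes a one-way design with one group of % RMSE values per spectral region, and the function name is hypothetical. The resulting F would then be compared against the tabulated critical value at the 5% level for the returned degrees of freedom.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((v - m) ** 2 for v in g) for g, m in zip(groups, group_means))
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

For two spectral regions with 100 splits each, the degrees of freedom are 1 and 198.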

3. Results and Discussion

3.1. Lubricating Oil

Figure 2 shows the IR spectra of acetonitrile with moisture at varying concentrations, extracted from lubricating oil. The spectra overlap perfectly except at the regions typically designated for the absorption of OH groups at 3750–3400 cm−1 and 1700–1600 cm−1. The band intensity increases with concentration, and these regions have been commonly used for prediction of moisture in oil. At the lower concentrations of 500 and 700 ppm, the peak intensities are closely similar. Given the spectral variability, these concentrations may be confused, suggesting that the detection of moisture at this level can be challenging.

Both regions, 3750–3400 and 1700–1600 cm−1, were evaluated for their predictive ability based on 100 training/test sets incorporating PLSR. The prediction performance was measured as the average % RMSE over the 100 iterations. With 100 training/test splits, the risk of bias is reduced because the prediction is not fitted to one single dataset. Figure 3 shows the predicted versus expected concentrations using spectral data at 3750–3400 and 1700–1600 cm−1. The average % RMSE of the 100 training/test sets suggests that the region at lower frequency demonstrates statistically better prediction accuracy.

This observation contrasts with the finding of Ng and Mintova [4], where the moisture was extracted using DMSO; the best prediction was reported using the spectral region at 5400–4800 cm−1, followed by 3800–3200 cm−1. The region at 1800–1500 cm−1 was found to give the lowest accuracy as a result of interference from aminic and phenolic additives and other oxidation products present in lubricating oil. On the contrary, Meng et al. [14] corroborate the finding of this study, concluding that the absorption at 1630 cm−1 experiences noticeably fewer interferences than the OH stretching region at 3400 cm−1. Van de Voort et al. [5], however, recommend the absorption at 3676 cm−1, implying that this frequency is less affected by the interference of phenolic antioxidants. Note that both Meng et al. [14] and Van de Voort et al. [5] apply the solvent extraction strategy using acetonitrile. Overall, both regions have been widely used for quantification of moisture in oil, whether directly or indirectly, as summarized in Table 1. The choice of optimal water band may differ as a result of matrix interference and the type of oil studied.

The dataset, with moisture concentrations ranging between 0 and 5000 ppm, was divided into two subsets of low (0, 500, 700, and 1000 ppm) and high (0, 1000, 2000, 3000, 4000, and 5000 ppm) concentrations to evaluate the method sensitivity. Both subsets are expected to yield comparable % RMSE if the sensitivity is not compromised. Table 2 summarizes the average % RMSE of the training/test sets for the low and high concentration subsets, according to spectral data at 1700–1600 and 3750–3400 cm−1. The average % RMSE is higher for the low concentration subset, indicative of poorer prediction.

To determine the model sensitivity, the calibration samples were subjected to self-prediction, as tabulated in Table 3. The concentration at 700 ppm was predicted with reasonable accuracy; hence the limit of detection (LOD) is recommended at 700 ppm. Nevertheless, the concentration at which the moisture can be comfortably detected, in other words the limit of quantification (LOQ), is suggested at 1000 ppm. This recommendation is in line with the literature, where Dong et al. [3] propose an LOD of 500 ppm whilst Holland et al. [19] suggest 1000 ppm, in agreement with Fitch [20]. In terms of precision, the prediction is subject to high variance, indicative of unstable models. According to Shang et al. [21], ATR-FTIR suffers from the limitations of short path length and weak signal for quantitative analysis, rendering measurements less accurate and less consistent. FTIR has been employed for quantification of moisture in oil with sensitivity ranging from as low as 30 ppm to 13000 ppm (Table 1); this broad variation is presumably attributable to the types of oil examined, sample preparation, analytical procedure, and validation strategies. In this study, acetonitrile extraction is applied together with exhaustive splitting of training and test sets to verify the prediction model.

The model was further applied to a set of blind samples, where the prediction was compared and verified with the Kittiwake moisture sensor. Table 4 shows the moisture concentrations predicted with PLSR using the spectral data at 1700–1600 cm−1 and the measurements obtained using the Kittiwake method. The accuracy and precision of the results are evaluated based on the percentage error (% error) and the percentage coefficient of variation (% CV). At concentrations lower than 800 ppm, the prediction with PLS regression is inconsistent and inaccurate; however, the prediction improves as the concentration increases. There is a marked difference between the concentrations predicted by PLSR and those measured by Kittiwake. Note that the recommended operating range for the Kittiwake sensor is between 0 and 3000 ppm, with a sensitivity of 500 ppm. At concentrations between 500 ppm and 800 ppm, the Kittiwake measurement is satisfactory with an error of 5–8%; however, as the concentration increases (>800 ppm), the method consistently records higher error than the PLSR prediction.
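The two figures of merit can be computed as in the sketch below. It assumes that % error is the absolute deviation of the mean prediction from the expected (spiked) concentration, relative to the expected concentration, and that % CV uses the sample standard deviation of replicate predictions. The function names and the replicate values in the demonstration are illustrative, not results from this study.

```python
import statistics


def percent_error(predicted_mean, expected):
    """Absolute deviation from the expected concentration, in percent."""
    return 100.0 * abs(predicted_mean - expected) / expected


def percent_cv(replicates):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


if __name__ == "__main__":
    replicate_predictions = [900.0, 1000.0, 1100.0]  # illustrative values only
    expected_ppm = 1000.0
    mean_prediction = statistics.mean(replicate_predictions)
    print("% error:", percent_error(mean_prediction, expected_ppm))
    print("% CV:", percent_cv(replicate_predictions))
```

A low % error with a high % CV would indicate predictions that are accurate on average but imprecise between replicates.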

3.2. Transformer Oil

Figure 4 shows the spectra of acetonitrile containing moisture extracted from transformer oil. The marker regions were likewise identified at 3750–3400 and 1700–1600 cm−1 where the spectral data were singled out for prediction according to training/test sets over 100 iterations.

Figure 5 shows the predicted versus expected concentrations of the training/test sets, where the spectral data at 1700–1600 cm−1 demonstrate statistically better predictive ability than those at 3750–3400 cm−1. The prediction of moisture in transformer oil is seemingly less promising than in lubricating oil. The average % RMSE for the test sets of lubricating oil and transformer oil is 16.55% and 28.63%, respectively. The prediction performance may be oil dependent, governed by the formulation and additives. Essentially, oils dissolve some water, with the saturation point governed by the amount of additives. Transformer oil is a mineral oil with minimal additives; it can be saturated with 3–10 ppm of water, whilst lubricating oils saturate at higher moisture levels depending on the oil type (hydraulic fluids 100–1000 ppm; industrial lubricating oil 600–5000 ppm; automotive lubricating oil 1000–5000 ppm; stern tube oil >16%) [22].

The dataset was likewise divided into two subsets of lower and higher concentration range to examine the prediction performance. As with lubricating oil, the prediction for samples at lower concentrations is characterized by a greater % RMSE of 60.75% (test samples), whilst in the higher range the prediction improves, with a % RMSE of 21.40% (Table 5). The prediction similarly exhibits better accuracy at concentrations above 1000 ppm, as suggested by the self-prediction results for transformer oil (Table 6). Table 7 summarizes the prediction of moisture in blind samples, verified with the Karl Fischer method. The corresponding % error and % CV are included in the table. The prediction with PLSR suffers at concentrations <1000 ppm, with a large error between 29 and 66%. As the moisture concentration increases, the prediction accuracy improves, with a percent error of 3–14%. For the Karl Fischer method, the measured water fluctuates inconsistently and unpredictably against the expected concentrations, with high error and variance. The coulometric Karl Fischer method is ideal for detection of moisture at low levels of 10 µg to 100 mg; at increased concentration, Margolis [23] reports reduced measured water when the oil becomes insoluble in the solvent.

This study reports the quantification of moisture using the indirect strategy of acetonitrile extraction. The prediction was also attempted using the spectra of oil directly spiked with moisture; unfortunately, the prediction was erroneous and irreproducible, likely due to scattering of infrared light by inhomogeneous water globules [15]. A surfactant stabilizer was added to reduce the scattering and improve the prediction; however, no observable improvement was found (results not included).

3.3. Prediction Program for Moisture in Oil

The algorithm was wrapped in a graphical user interface (GUI) and compiled into an executable file that runs on Windows. In this program, users provide the calibration spectra with known moisture concentrations for evaluation according to 100 training/test sets. The program can then be executed to predict the moisture in unknown samples. Figure 6 shows the interface and output of the program for prediction of moisture in oil. The program can be used to predict other substances provided that the marker region is identified.

Essentially, water is present in oil as dissolved, emulsified, or free water. At a low concentration, water is able to disperse in oil; however, as the level increases, it becomes immiscible. The saturation level depends largely on the type of base oil, additives, temperature, and pressure. Dissolved water may be present at a concentration of less than 2000 ppm, whilst emulsified and free water range from 150–5,000 ppm and 500–50,000 ppm, respectively [24]. In this study, the moisture concentrations examined correspond mostly to emulsified and free water. For concentrations lower than 1000 ppm, the method is not sufficiently sensitive and may not be useful for moisture monitoring in utility oil. Typically, the allowable moisture content in in-service transformer oil is less than 35 ppm [25]. This implies that FTIR is unsuitable for prediction of moisture in transformer oil. For lubricating oil, the desired level of moisture may vary depending on the machine’s specification and operation. In some machines, a small amount of water can be destructive. Normally, the moisture in lubricating oil is maintained below 0.03%, and a concentration of 0.15–0.20% can be damaging [26].
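To put the limits quoted above on the same scale as the rest of the study, the percentages can be converted to ppm (1% corresponds to 10,000 ppm) and compared with the 1000 ppm level below which the method is reported as insufficiently sensitive. The sketch below is illustrative, and the names are hypothetical.

```python
METHOD_FLOOR_PPM = 1000.0  # below this, the text reports insufficient sensitivity


def percent_to_ppm(percent):
    """Convert a percentage to parts per million (1% = 10,000 ppm)."""
    return percent * 10_000.0


def below_method_floor(limit_ppm, floor_ppm=METHOD_FLOOR_PPM):
    """True if a monitoring limit lies below the usable range of the method."""
    return limit_ppm < floor_ppm


LIMITS_PPM = {
    "transformer oil, allowable in service": 35.0,
    "lubricating oil, normal ceiling (0.03%)": percent_to_ppm(0.03),
    "lubricating oil, damaging from (0.15%)": percent_to_ppm(0.15),
}

if __name__ == "__main__":
    for name, limit in LIMITS_PPM.items():
        print(name, round(limit), "ppm, below method floor:", below_method_floor(limit))
```

On this scale, the 35 ppm transformer oil limit and the 0.03% (300 ppm) lubricating oil ceiling both fall below 1000 ppm, whereas the damaging range of 0.15–0.20% (1500–2000 ppm) lies above it.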

3.4. PLSR versus Simple Linear Regression (SLR)

The peak intensity is found to increase with concentration without sign of interference; this relationship suggests that the moisture could possibly be predicted using simple linear regression (SLR). To test this, the same 100 training/test splits were subjected to PLSR and SLR using the spectral area at 1700–1600 cm−1. Table 8 summarizes the average % RMSE of prediction using PLSR and SLR based on the 100 training/test splits (concentration range 0–5000 ppm). Evidently, the prediction performance of PLSR and SLR is comparable, with the prediction in lubricating oil demonstrating better accuracy.
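A minimal sketch of the SLR alternative is given below. It assumes that the band area is obtained by trapezoidal integration without baseline correction and that concentration is regressed directly on area; neither detail is specified in the text, and the function names are hypothetical.

```python
def band_area(wavenumbers, absorbances, low=1600.0, high=1700.0):
    """Trapezoidal area under the spectrum between low and high (cm-1)."""
    # Sorting makes the result independent of the scan direction.
    pts = sorted((w, a) for w, a in zip(wavenumbers, absorbances) if low <= w <= high)
    return sum(0.5 * (a0 + a1) * (w1 - w0) for (w0, a0), (w1, a1) in zip(pts, pts[1:]))


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_concentration(area, slope, intercept):
    """Predicted moisture concentration for a given band area."""
    return slope * area + intercept
```

In use, fit_line would be given the band areas and known concentrations of the training spectra, and predict_concentration the band areas of the test spectra, so that the same splits can be scored for both SLR and PLSR.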

4. Conclusion

FTIR integrated with PLSR is feasible for prediction of moisture in oil, as demonstrated on lubricating oil and transformer oil. The spectral region at 1700–1600 cm−1 exhibited better predictive ability, with lubricating oil showing a lower % RMSE. The method unfortunately yielded high prediction error for samples with concentrations below 700 ppm. Hence, for monitoring of moisture in utility oil, which is typically low in concentration, the method lacks sensitivity. The Karl Fischer and Kittiwake methods were intended to validate the FTIR method in detection of moisture; however, the results suggest that the three methods are not directly comparable because their optimum measuring ranges differ. The coulometric Karl Fischer method is commonly recommended for moisture levels from a few ppm to <1–2%, whilst the Kittiwake method is found to work better at less than 1,000 ppm, with a detection limit of 500 ppm. For the FTIR method, on the other hand, detection of moisture is suggested at >700 ppm with the solvent extraction strategy.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

The authors thank Sarawak Energy Berhad for funding the project (E14051/F(07)/51/FTIR/2016(3)) and providing support for sample analysis with Kittiwake, Karl Fischer, and FTIR methods.