Aflatoxin B1 (AFB1) contamination in peanut oil poses a significant threat to human health. A method based on Fourier transform near-infrared (FT-NIR) spectroscopy was developed for qualitative and quantitative analysis of AFB1 contamination in peanut oil. Spectra of 94 samples were collected in transmission mode and preprocessed with derivative and smoothing filters. Principal component analysis (PCA), discriminant analysis (DA), and partial least squares (PLS) regression were applied to establish the qualitative and quantitative analysis models. The qualitative model distinguished effectively between positive and negative samples, with an identification accuracy of up to 100%. The correlation coefficient (R2), the root mean square error of calibration (RMSEC), and the relative percent deviation (RPD) of the quantitative model were 0.951, 3.12%, and 4.52, respectively. There was a good linear relationship between the predicted and reference concentrations of the samples, with a correlation coefficient of 0.981. The qualitative and quantitative analysis models developed in this work may provide a reference for researchers engaged in nondestructive testing of food and agricultural products.

1. Introduction

Peanut is an important commercial and oil crop in China. Its yield ranks first among the oil crops in China and accounts for 40% of the world’s total output [1]. Peanut oil is the main edible oil consumed by Chinese residents. Peanut oil is prepared by processing technologies such as physical pressing and solvent extraction [2, 3]. It contains more than 80% unsaturated fatty acids [4]. Its zinc content is 8.48 mg/100 g, the highest among edible oils. Many consumers prefer natural peanut oil, which is produced by a simple physical pressing process and retains its fragrance to the greatest extent. However, mycotoxins and other harmful substances may be present in natural peanut oil because no refining process is applied during production [5, 6]. For example, in 2018, the aflatoxin concentration in peanut oil in Guangdong province was found to be higher than the limit stipulated by the National Standard of China [7]. A similar problem was also studied by Qin et al. in 2021 [8]. Aflatoxin contamination is one of the major risk factors for grain and oil crops during consumption and export, and it is a problem of worldwide concern. Aflatoxin B1 (AFB1) is the most toxic and widely distributed aflatoxin. AFB1 has carcinogenic, teratogenic, and mutagenic effects, and its toxicity is 10 times higher than potassium cyanide and 68 times higher than arsenic. AFB1 has been classified as a Group 1 carcinogen by the International Agency for Research on Cancer (IARC) [9, 10]. Therefore, exploring a nondestructive, rapid, and accurate detection method for AFB1 in peanut oil is of great theoretical and practical significance.

Traditional techniques for the detection of AFB1 include thin-layer chromatography (TLC), high-performance liquid chromatography (HPLC), and enzyme-linked immunosorbent assay (ELISA) [11, 12]. These methods are not suitable for large-scale detection because they require intricate sample pretreatment and are time-consuming. In comparison, rapid detection methods such as hyperspectral imaging, infrared spectroscopy, and the electronic nose have attracted increasing attention [13–15]. Near-infrared spectroscopy (NIS) has been widely used in the detection of agricultural products and food because it requires no sample pretreatment, is rapid, consumes no chemical reagents, and can detect multiple components simultaneously [16–20]. NIS is applicable to most types of organics. It measures the overtones and combination bands of molecular vibrations, which are related to the composition and structure of organics [21]. In food analysis, NIS has been applied to the rapid detection of mildew in corn, peanut, Chinese chestnut, and fresh jujube [22–25]. A support vector machine regression model was established to predict the total mildew count in rice based on NIS. A linear relationship was found between the predicted and reference values of the total mildew count, with a correlation coefficient of 0.905 [26]. Principal component analysis (PCA) and K-nearest neighbor (KNN) pattern recognition methods were applied to the qualitative analysis of germinated peanuts with an identification accuracy of 100% [27]. Liu et al. developed a rapid analysis of mildew in peanuts based on the spectral information of NIS and mid-infrared spectroscopy, and established the qualitative and quantitative models via multivariate statistical analysis [23].

Previous work on the detection of aflatoxin by near-infrared spectroscopy focused mainly on granular agricultural products such as rice, corn kernels, and peanuts. However, contamination in such products is random, and the distribution of aflatoxin may be uneven. In this study, qualitative and quantitative models were established for the analysis of AFB1 contamination in peanut oil, with the aim of laying a foundation for rapid and large-scale detection of AFB1 content in peanut oil.

2. Materials and Methods

2.1. Material

Peanuts (No. 1 of zhongkaihua) were sterilized by cobalt-60 irradiation at a dose of 15 kGy. The water content of the peanuts was adjusted to 20%. The sterilized peanuts were stored at 4°C before use. Aspergillus flavus was inoculated on potato dextrose agar (PDA) medium and cultured in a constant-temperature incubator at 30°C. A large number of spores were produced after 14 days, and a spore suspension was prepared for later use.

2.2. Methods

The sterilized peanuts were soaked in the spore suspension, stirred for 5 s, and transferred into an incubator with constant temperature and humidity. The temperature and humidity were 30°C and 80%, respectively, which facilitated the growth of Aspergillus flavus and the production of aflatoxin. This process lasted for 18 days. Sampling started on the third day. Eighty grams of peanuts were weighed for each sample, and four samples were taken each day, giving 64 samples in total (No. 1–64). The positive peanut oil was obtained by the hot-press approach used in small oil mills. Ten milliliters of the supernatant was taken from each sample for analysis: 8 mL was used for the determination of the reference AFB1 concentration via the enzyme-linked immunosorbent assay (ELISA), and the remaining 2 mL was used for near-infrared characterization. The negative samples, in which no AFB1 was detected by the ELISA, comprised four kinds of commercially available peanut oil from supermarkets and one kind of qualified oil from the sterilized peanuts (No. 1 of zhongkaihua). Six samples were taken from each kind of oil to give a total of 30 negative samples. In total, 94 samples (64 positive and 30 negative) were obtained for analysis.

Sample spectra were collected with a NICOLET IS 10 Fourier transform near-infrared spectrometer (Thermo Nicolet, USA). The samples were placed in a square quartz cuvette with an optical path of 2 mm. The spectroscopic analysis was performed using a transmission analysis module with the built-in background of the instrument as a reference. The scanning range was 12 000–4 000 cm−1 with a resolution of 8 cm−1. The number of scans and the gain were 64 and 4, respectively. A background spectrum was collected before each sample. Five spectra were taken for each sample, giving 320 spectra of positive samples and 150 spectra of negative samples. After spectroscopic scanning, the reference AFB1 concentrations of the positive samples were determined by the ELISA according to the National Standard of China (GB/T 5009.22-2016) [28]. These reference AFB1 concentrations were used for model establishment.

2.3. Data Processing

Several data-processing methods were applied to eliminate the influence of high-frequency random noise, baseline shift, and sample heterogeneity. These methods include the derivative smoothing filter (Norris derivative filter, Nd), the convolution smoothing filter (Savitzky–Golay, SG), the first derivative (1st D), and the second derivative (2nd D). More useful information could be obtained after pretreatment. In some cases, a combination of the above methods was selected to obtain the best data-processing result.
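As an illustration of how a Norris-type derivative filter works, the following Python sketch smooths a spectrum with a moving-average segment and then takes a gap difference. It is a minimal sketch under stated assumptions: the function names, the segment and gap sizes, and the uniform spacing `step` between data points are hypothetical choices and are not the settings used in the software of this study.

```python
import numpy as np

def moving_average(y, segment):
    """Smooth y with a centered moving average of odd length `segment`."""
    if segment < 1 or segment % 2 == 0:
        raise ValueError("segment must be a positive odd integer")
    half = segment // 2
    padded = np.pad(np.asarray(y, dtype=float), half, mode="edge")
    kernel = np.ones(segment) / segment
    return np.convolve(padded, kernel, mode="valid")  # same length as y

def norris_first_derivative(y, segment=5, gap=5, step=1.0):
    """Gap-segment first derivative: segment smoothing, then a gap difference.

    `step` is the (assumed uniform) spacing between data points, e.g. in cm^-1.
    Points closer than `gap` to either end cannot be evaluated and are NaN.
    """
    if gap < 1:
        raise ValueError("gap must be at least 1")
    smoothed = moving_average(y, segment)
    deriv = np.full_like(smoothed, np.nan)
    deriv[gap:-gap] = (smoothed[2 * gap:] - smoothed[:-2 * gap]) / (2 * gap * step)
    return deriv
```

Because the derivative is a difference of smoothed values, a constant baseline offset cancels out, which is the property that makes this kind of filter useful against baseline shift.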

2.4. Model Establishment and Evaluation

The data were processed and modeled using OMNIC 9.0 software and TQ Analyst 9.0. Chemometrics was used to analyse the spectra and reference AFB1 concentrations. Discriminant analysis (DA) is a supervised pattern recognition method based on a class model that combines principal component analysis (PCA) and the Mahalanobis distance method. PCA is a multivariate statistical method for dimensionality reduction that extracts the features that best describe the samples. The Mahalanobis distance is calculated from the principal component scores, and classification is based on the Mahalanobis distance between the unknown and known samples. In this work, an unknown sample passes the judgment only if its discriminant distance to the known samples is less than 3. The quality of the model is evaluated based on the performance index, the clustering effect in the three-dimensional score plot, the recognition accuracy of the model for the external verification samples, and the error rate. Accordingly, the principal components of the FT-NIR spectra of peanut oil were first extracted by PCA, and DA was then used to determine qualitatively whether a sample was contaminated with AFB1.
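The combination of PCA scores and Mahalanobis distance described above can be sketched in Python as follows. This is an illustrative sketch on synthetic data, not the TQ Analyst implementation: the function names, the per-class covariance estimate, the two-component model, and the synthetic spectra are all assumptions. Only the distance threshold of 3 is taken from the text.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the mean spectrum and the first principal axes of X (samples x variables)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def fit_class(scores):
    """Summarize one class by its score centroid and inverse covariance matrix."""
    center = scores.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(scores, rowvar=False))
    return center, cov_inv

def classify(x, mean, axes, classes, threshold=3.0):
    """Assign x to the nearest class, or None if no class is within the threshold."""
    score = (x - mean) @ axes.T
    distances = {}
    for label, (center, cov_inv) in classes.items():
        diff = score - center
        distances[label] = float(np.sqrt(max(diff @ cov_inv @ diff, 0.0)))
    best = min(distances, key=distances.get)
    return (best if distances[best] < threshold else None), distances

# Synthetic demonstration data (hypothetical, not the spectra of this study).
rng = np.random.default_rng(0)
offset = np.ones(20)
X_neg = rng.normal(0.0, 0.05, size=(30, 20))
X_pos = offset + rng.normal(0.0, 0.05, size=(30, 20))
mean, axes = fit_pca(np.vstack([X_neg, X_pos]), n_components=2)
classes = {
    "negative": fit_class((X_neg - mean) @ axes.T),
    "positive": fit_class((X_pos - mean) @ axes.T),
}
label, distances = classify(X_pos.mean(axis=0), mean, axes, classes)
```

A sample whose distance to every class exceeds the threshold is returned as `None`, which corresponds to failing the judgment.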

Partial least squares (PLS) regression is a classical linear modeling method that compresses the spectral data into an orthogonal structure of latent variables describing the maximum covariance between the spectral information and the reference content values. Compared with traditional multiple linear regression, PLS has the advantages of screening the spectral data comprehensively, extracting the effective information of the sample spectrum fully, and taking the internal relationships into account. The model established by PLS can therefore identify information more accurately.
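For readers who want to see the latent-variable idea in code, the following Python sketch fits a single-response PLS model with the NIPALS algorithm. It is an illustration under stated assumptions: the function names are hypothetical, the data are mean-centered but not scaled, and the number of latent variables is chosen by the caller. It does not reproduce the settings of the software used in this study.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS regression for a single response (PLS1) with the NIPALS algorithm.

    Returns regression coefficients and an intercept for the original variables.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # direction of maximum covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                    # scores of the latent variable
        tt = t @ t
        p = Xk.T @ t / tt             # spectral loading
        c = yk @ t / tt               # response loading
        Xk = Xk - np.outer(t, p)      # deflate X and y
        yk = yk - c * t
        weights.append(w)
        loadings.append(p)
        y_loadings.append(c)
    W = np.array(weights).T
    P = np.array(loadings).T
    q = np.array(y_loadings)
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def pls1_predict(X, coef, intercept):
    """Predict the response from spectra with a fitted PLS1 model."""
    return np.asarray(X, dtype=float) @ coef + intercept
```

In practice the number of latent variables is usually chosen by cross-validation, for example by minimizing the cross-validation error.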

The determination coefficient (R2), root mean square error (RMSE), and relative percent deviation (RPD) of the modeling set are used to evaluate the estimation model of AFB1 in peanut oil. The model performs well when R2 is close to 1 and RMSE is close to 0. The model can be used for process control or quality control if the value of RPD is greater than 3 [29], and it has excellent predictive ability when the value of RPD is greater than 4. The robustness of the evaluation model was verified externally [30].
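The three evaluation indices can be computed as in the Python sketch below. The formulas are the commonly used definitions and are an assumption here, since the exact definitions used by the modeling software are not stated in this work. In particular, RPD is taken as the standard deviation of the reference values divided by the RMSE.

```python
import numpy as np

def r2(y_ref, y_pred):
    """Determination coefficient: 1 - residual sum of squares / total sum of squares."""
    y_ref, y_pred = np.asarray(y_ref, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_ref - y_pred) ** 2)
    ss_tot = np.sum((y_ref - y_ref.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y_ref, y_pred):
    """Root mean square error between reference and predicted values."""
    y_ref, y_pred = np.asarray(y_ref, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_ref - y_pred) ** 2)))

def rpd(y_ref, y_pred):
    """Sample standard deviation of the reference values divided by the RMSE."""
    return float(np.std(np.asarray(y_ref, float), ddof=1) / rmse(y_ref, y_pred))
```

Applied to a calibration set, a validation set, or cross-validation predictions, `rmse` gives RMSEC, RMSEP, or RMSECV, respectively.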

3. Results and Discussion

3.1. Analysis of Reference Aflatoxin B1 (AFB1) Concentration

The reference AFB1 concentrations of the 64 positive samples were determined by the enzyme-linked immunosorbent assay (ELISA) (Table 1). According to the National Standard of China (GB 2761-2011), the AFB1 concentration in peanut oil must be lower than 20 μg/kg [8]. The AFB1 concentrations of the positive samples ranged from 4.44 to 38.26 μg/kg, covering values both below and above this limit with a good gradient distribution. An FT-NIR calibration model based on these samples could therefore be expected to have good applicability.

Forty-nine positive samples were selected as the calibration set via the concentration gradient method to establish the model. The remaining 15 samples were assigned to the external validation set to evaluate the established model. The statistics of all positive samples are shown in Figure 1. The concentration range of the calibration set is wider than that of the external validation set, indicating that the calibration data are representative.
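One plausible reading of the concentration gradient method is sketched below in Python: the samples are sorted by reference concentration, and validation samples are picked at regular intervals along the sorted order, keeping the lowest and highest samples in the calibration set. The details of the procedure are not given in this work, so the function name and the selection rule are assumptions.

```python
import numpy as np

def gradient_split(concentrations, n_validation):
    """Split sample indices into calibration and validation sets.

    Validation samples are taken at evenly spaced positions in the
    concentration-sorted order, excluding the two extreme samples.
    """
    conc = np.asarray(concentrations, dtype=float)
    order = np.argsort(conc)
    positions = np.round(np.linspace(1, len(conc) - 2, n_validation)).astype(int)
    validation = order[positions]
    calibration = np.setdiff1d(order, validation)
    return calibration, validation
```

With 64 samples and `n_validation=15`, this yields 49 calibration and 15 validation samples, and the calibration set always contains the lowest and highest concentrations.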

3.2. Near-Infrared Spectra of the Positive Samples

Figure 2 shows the near-infrared spectra of all positive samples in the range of 10000–4300 cm−1. Most of the spectra had similar features, with multiple absorption peaks and rich spectral information. Figure 3 compares the average spectra of the positive and negative samples. The two average spectra displayed many common characteristic peaks and only slight differences. The overall absorbance of the positive samples was slightly larger, which might be due to changes in the protein and fat content of peanuts contaminated with AFB1, leading to variation in the spectrum of the peanut oil. These spectral changes could indicate the content of AFB1 indirectly [31]. In addition, the peaks above 9000 cm−1 could be attributed to the frequency-doubling (overtone) stretching vibration of O-H in water. These peaks were not appropriate for model establishment because of several interferences [32]. Therefore, the peaks in the range of 9000–4300 cm−1 were selected for analysis.

The first derivative of the original spectrum was calculated to extract the characteristic bands [26]. As shown in Figure 4, multiple characteristic absorption peaks in the near-infrared region were observed after the first derivative treatment, suggesting that the near-infrared spectrum could reflect the differences in chemical composition between samples. The peaks in the range of 8400–8100 cm−1 could be attributed to the first-order frequency-doubling (overtone) stretching vibration of the -CH2 and -CH3 groups, while the peaks in the ranges of 7400–6900 cm−1 and 6000–5670 cm−1 could be assigned to the stretching vibration of the C-H bond of the aromatic ring and the second-order frequency-doubling stretching vibration of the -CH2 group in fatty acids, respectively [23]. In addition, the peaks in the range of 4770–4520 cm−1 corresponded to the combination band of the second-order frequency-doubling stretching vibration of the C=O group in esters and the stretching vibration of the N-H group in amino acids [26]. These characteristic absorption bands were closely related to AFB1 contamination, indicating that moisture, carbohydrate, and protein affected the AFB1 content in peanuts. Thus, the above characteristic bands were selected to establish the models.

3.3. Qualitative Model
3.3.1. Model Establishment

The DA model was established using the scores of the first 10 principal components extracted by PCA. Twenty-five negative and fifty-nine positive samples were used for modeling, and the remaining samples were used for external validation. Several data-processing methods were applied to the near-infrared spectra (Figure 5). The combination of the first derivative (1st D) and the Norris derivative filter (Nd) proved to be the most efficient method, with the highest performance index (97.1).

The DA model differentiated clearly between positive and negative samples (Figure 6). The PCA scores showed that the samples were divided into two main clusters, which were related to the content of aflatoxin in the samples. The contribution rates of PC1, PC2, and PC3 were 89.87%, 8.07%, and 0.88%, respectively, giving a total contribution rate of 98.82% (Figure 7). The PCA loadings showed how the different characteristic ranges of the sample spectra contributed to the principal components (Figure 8). Changes were observed in the ranges of 4770–4520 cm−1, 6000–5670 cm−1, 7400–6900 cm−1, and 8400–8100 cm−1, indicating that there was a certain relationship between aflatoxin and the components of the positive samples. The spectral information may be affected by these factors and therefore show a clustering trend.
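The contribution rate of a principal component is the share of the total variance that it explains. The Python sketch below shows one common way to compute these rates from the singular values of the mean-centered spectra and checks the sum of the three reported rates. The function name is hypothetical, and the sketch does not reproduce the software used in this study.

```python
import numpy as np

def contribution_rates(X):
    """Percentage of total variance explained by each principal component."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return 100.0 * variances / variances.sum()

# Sum of the three contribution rates reported in the text (PC1, PC2, PC3).
reported = [89.87, 8.07, 0.88]
total = round(sum(reported), 2)
```

The rates are returned in descending order, so the cumulative contribution of the first three components is `contribution_rates(X)[:3].sum()`.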

3.3.2. Model Evaluation

Five positive and five negative samples that did not participate in the modeling were imported into the DA model to verify its predictive power. The Mahalanobis distance between the unknown and known samples must be less than 3 to pass the judgment. As shown in Table 2, the model correctly identified all 10 samples, giving a recognition accuracy of 100%. The reason for this excellent predictive power might be that the resolution of the spectral data and the sensitivity of the model were improved when the 1st D plus Nd filter was employed to reduce the interference from random noise and baseline shift. The model showed high discrimination and robustness and could meet the requirements of rapid and accurate identification.

3.4. Quantitative Model
3.4.1. Selection of Data-Processing Methods

The data-processing methods used for the qualitative model were also applied in the establishment of the quantitative model. Their applicability was evaluated by the parameters of the quantitative model established by partial least squares (PLS). Data processing of the spectra allowed more useful multivariate information to be extracted, contributing to a more effective model. As shown in Table 3, the performance indices of the quantitative models established with the different data-processing methods were all greater than 0.9, and the RMSE values were small. The spectral resolution could be improved by using the first or second derivative alone. However, some useful information related to AFB1 might be lost, leading to a poorer model. The precision of the model could be improved by combining the derivative with SG or Nd smoothing, because smoothing reduces random noise and improves the signal-to-noise ratio. Compared with SG smoothing, Nd smoothing plus the first derivative gave the best quantitative model. The R2, RMSEC, and RPD were 0.951, 3.12%, and 4.52, respectively, suggesting that the model had excellent predictive ability. Therefore, Nd smoothing plus the first derivative was selected as the optimal data-processing method.

3.4.2. Model Establishment and Evaluation

The calibration correlation coefficient R2, the prediction correlation coefficient Rp2, and the cross-validation correlation coefficient Rcv2 were 0.951, 0.981, and 0.920, respectively (Figures 9 and 10). The root mean square errors of calibration (RMSEC), prediction (RMSEP), and cross-validation (RMSECV) were 3.12%, 2.57%, and 3.87%, respectively. External verification with 15 positive samples showed a linear relationship between the predicted and reference concentrations of the samples with a correlation coefficient of 0.981 (Figure 11), indicating excellent predictive power of the model.

The results of external verification are shown in Table 4. The maximum absolute deviation between the predicted and reference AFB1 concentrations was 2.62 μg/kg, while the minimum was 0.03 μg/kg. The absolute value of the relative deviation ranged from 0.22% to 7.64%, which is a considerable fluctuation. The reason might be that the AFB1 content of some samples was very low, so that even a minor fluctuation in the data produced a large relative deviation. According to a previous report [33], when the accuracy of the experimental data is poor and the number of samples is small, the accuracy of the model can be improved markedly by increasing the number of samples. As shown in Table 4, the relative deviation was lower than 10%, which could meet the requirements of rapid detection and provide a reference for routine analysis. In addition, a paired t-test was applied to estimate the difference between the near-infrared spectroscopy method and the enzyme-linked immunosorbent assay. The result showed no significant difference between the two methods, with a value of 0.991.
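The deviation and paired-test calculations used in this kind of comparison can be sketched in Python as follows. The function names are hypothetical, and the data of Table 4 are not reproduced here. The sketch computes only the t statistic and the degrees of freedom.

```python
import numpy as np

def relative_deviation_percent(reference, predicted):
    """Signed relative deviation of each prediction from its reference value, in %."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * (predicted - reference) / reference

def paired_t_statistic(reference, predicted):
    """Paired t statistic and degrees of freedom for two methods on the same samples."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    n = diff.size
    t = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
    return float(t), n - 1
```

A p-value can then be obtained from Student's t distribution with the returned degrees of freedom, for example with `scipy.stats.ttest_rel`, which performs the same paired test directly.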

4. Conclusions

Qualitative and quantitative analysis models for AFB1-contaminated peanut oil were established based on FT-NIR spectroscopy and chemometrics. The identification accuracy of the qualitative model was 100%. For quantitative analysis, the correlation coefficient and the root mean square error of the calibration set were 0.951 and 3.12%, respectively. The value of RPD reached 4.52, and the correlation coefficient between the predicted and reference AFB1 concentrations was 0.981, indicating excellent predictive ability. The results demonstrated that FT-NIR could be applied to both rapid identification and quantitative analysis of AFB1 in peanut oil. In addition, near-infrared spectroscopy might be developed into a nondestructive and large-scale method for the detection of aflatoxin in peanut oil if a larger number of natural samples are included in the model.

Data Availability

The raw data used in our research cannot be shared at this time, as the data also form part of an ongoing study.

Conflicts of Interest

The authors declare no conflicts of interest.


Acknowledgments

The authors thank the Science and Technology Program of Guangdong Province (2020B121201013), South China Agricultural University and Jiaying University-Universities Serving Rural Revitalization Community, Guangdong Provincial Key Laboratory of Conservation and Precision Utilization of Characteristic Agricultural Resources in Mountainous Areas, Meizhou Rural Science and Technology Correspondent Team Project, 2020 (202010582014) and 2021 (202110582110) Innovation and Entrepreneurship Training Program for College Students, and 2021 “Climbing Plan” Special Fund Project of Guangdong Science and Technology Innovation Strategy (202101) for financial support.