Abstract

Consumers are concerned about food adulteration, and pork is the principal adulterant of beef and mutton. Conventional detection methods each have limitations; we therefore sought to develop an efficient and economical method for identifying meat species using infrared spectroscopy. The Mahalanobis distance method was used to remove outliers from the spectral data. Interferences were eliminated using multiplicative scatter correction, standard normal variate, Savitzky-Golay smoothing, and normalization. Partial least squares discriminant analysis (PLS-DA) and support vector machine (SVM) were used to establish identification models. After outlier removal with the Mahalanobis distance method, the determination coefficient of the test sets increased from 0.93 to 0.99, and the RMSEC and RMSECV decreased from 0.17 to 0.09 and from 0.21 to 0.11, respectively. With PLS-DA, the determination coefficients of the calibration and test sets both reached 0.99, the RMSEC was 0.06, and both the RMSECV and RMSEP were 0.08. With SVM, the corresponding determination coefficients were 0.97 and 0.96, and the RMSEC, RMSECV, and RMSEP were 0.15, 0.17, and 0.24, respectively. In summary, infrared spectroscopy combined with PLS-DA identified the samples better than the SVM model and can be used as an effective method to identify pork, beef, and mutton samples.

1. Introduction

Consumers pay increasing attention to food security, in particular to the origin and content of the food they buy, seeking the best nutrition, taste, and flavor. Because of the demand for quality and rising prices, adulteration remains quite common in food products including milk [1, 2], wine [3, 4], table oil [5, 6], tea [7], coffee [8], and meats [9–11]. For example, 20% of the livestock meat tested in Guangdong Province, China, during 2014-2015 was reported to be counterfeit [12]. The European criminal police organization (EUPOL) and the international criminal police organization (INTERPOL) have also collaborated in monitoring food security, which uncovered 2500 tons of illegal and counterfeit food [13]. Normally, meat is adulterated by substituting low-priced or even banned meat species for high-priced ones, such as pork adulterated into beef, mutton, or other meat [9–11, 14]. Counterfeiting of common livestock meat has caused serious social issues: it not only harms the interests of consumers with religious concerns or allergies but also damages the credibility of enterprises [15]. For example, many Hindus do not eat beef, while Islamic and Jewish laws forbid the eating of pork [16].

Several methods exist for determining the animal species of origin in meat products. Nucleic acid-based methods, also known as molecular techniques, include DNA barcoding [17], DNA fingerprinting [18], PCR assays [19], real-time PCR assays [20], random amplified polymorphic DNA (RAPD) [21], restriction fragment length polymorphisms (RFLP) [22], amplified fragment length polymorphism (AFLP) [23], PCR simple sequence repeat (PCR-SSR) [24], and tandem repeat marker assays [8]. Protein-based methods rely on immunoassays or chromatography and include gas chromatography (GC) [6], liquid chromatography (LC) [25], high-performance liquid chromatography (HPLC) [5], enzyme-linked immunosorbent assay (ELISA) [26], several blotting immunological assays [27, 28], and electrophoretic analysis [29, 30]. However, each technique has its own limitations: it may be expensive, laborious, inadequate, or time-consuming, may require a diverse range of equipment, or may give results that are difficult to interpret [10]. Therefore, there is an urgent need to establish a fast and reliable method to authenticate meat species. In recent years, attention has progressively turned to spectroscopic techniques to overcome those limitations. Infrared spectroscopy is a spectroscopic technique based on the interaction of infrared radiation with matter. It can be used to identify and quantify compounds, which absorb at frequencies characteristic of their structure; these frequencies depend on the molecular potential energy surfaces, the masses of the atoms, and the associated vibronic coupling. Because Fourier transform infrared (FT-IR) spectroscopy is fast and simple to operate, it has been widely applied in the detection of agricultural products such as wine [3, 4], olive oil [31], tea [7], and meat [10, 13, 32, 33].

In this work, FT-IR spectroscopy was applied to dried and ground meat samples to achieve rapid, nondestructive detection of pork adulterated into beef and mutton. The methods used here differ from those of previous reports in how the sample data are analyzed, with the aim of efficient and economical identification. To exclude samples that had a negative impact on the model, the Mahalanobis distance method [34] was used to eliminate the outlier samples in this study. Different preprocessing and modeling methods were also compared to obtain the best identification method. Through this article, we hope to provide a valuable solution for distinguishing different meats, which would help to maintain livestock product standards and to protect consumers and producers against fraudulent substitution of quality products in foods.

2. Materials and Methods

2.1. Sample Collection and Preparation

Fresh muscle samples from the hind leg and breast of pork, beef, and mutton (a total of 180 samples) were purchased from markets in Guangzhou, Guangdong Province, China. To increase the diversity of the meat, the 180 samples were collected in 20 batches (9 samples in each batch: 3 each of pork, beef, and mutton) from different stores and markets from September to November 2015. The pork samples were collected from ordinary pork and three brands of local pork (YiHao, AnKang, and Yao), which differed significantly in price and age. The ordinary pork was inexpensive, with a feeding time of 5 to 6 months and a weight between 95 and 110 kilograms. In contrast, the local pork was more expensive, with a longer feeding time (10–12 months) and a weight between 100 and 120 kilograms. Mutton samples were deliberately purchased from different feeding places (including Qingyuan, Zhanjiang, and Hainan) to make the mutton samples more representative. The difference between mutton from different places was not large (about 8-9 months, 50–60 kilograms), whereas the beef samples (18–24 months, 400–450 kilograms) were not identified by habitat or variety. Each sample was transported to the laboratory immediately after purchase, sliced separately after removal of the fat, and dried in an oven at 45°C for 24 h. The dried lean meat was then ground into powder and sifted through a mesh screen before being packed in dry polyethylene wrap. The sample powders were kept in sealed glass bottles at 4°C until use.

2.2. Infrared Spectroscopy

Sample preparation for FT-IR has an important effect on the consistency of prediction. In particular, a lack of uniformity in the meat samples reduces the precision of estimation. Meat sample powders were mixed with potassium bromide (KBr) at a ratio of 1 : 100 (w/w) before infrared spectroscopy analysis. The processed samples were scanned on a PE Spectrum 100 infrared spectrometer (PerkinElmer, Massachusetts, USA). The scanning parameters were as follows: wavenumber range 4000–450 cm−1, resolution 0.4 cm−1, ambient temperature 25°C, and humidity 30 ± 5%. Each sample was scanned three times, and the mean value was used as the original spectral data. The environment was kept consistent during the test, and the background spectrum was measured every 2 hours.

2.3. Methods
2.3.1. Mahalanobis Distance Method for Outlier’s Identification

The central problem was interference arising during the spectral analysis of the selected samples. Outliers might be due to contaminants in the samples, instability of the instrument, environmental disturbances, or slips during sample preparation. Such outliers may have a great impact on the stability and accuracy of the model and were therefore removed using the Mahalanobis distance method. In this report, we considered the following model:

$$D_i = \sqrt{(x_i - \bar{x})^{T} S^{-1} (x_i - \bar{x})} \quad (1)$$

where $i$ is the sample number, $x_i$ is the spectral vector of sample $i$, $\bar{x}$ is the average value of the spectral vectors, $S$ is the covariance matrix, and $D_i$ is the Mahalanobis distance. Within each meat type (60 samples), the spectra that differed most from the others were dismissed by setting a threshold on the Mahalanobis distance:

$$D_t = \bar{D} + \delta \, \sigma_D \quad (2)$$

As shown in (2), $D_t$ is the threshold, $\bar{D}$ is the average value of the Mahalanobis distances, $\delta$ is the adjustable parameter, and $\sigma_D$ is the standard deviation of the Mahalanobis distances.

In this report, δ = 3 was used to determine the threshold Mahalanobis distance; when $D_i$ was higher than $D_t$, the sample was considered an outlier and excluded.
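The outlier rule described above can be sketched in a few lines of dependency-free Python. This is an illustration under stated assumptions, not the MATLAB code used in the study: the threshold is taken as the mean distance plus δ times the sample standard deviation of the distances, and each sample is assumed to be already reduced to a handful of features (for example, principal component scores), because the covariance matrix of full spectra with more wavenumbers than samples is singular. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]  # fails if the covariance is singular
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_distances(rows):
    """Distance of every row from the mean row, using the sample covariance."""
    n, p = len(rows), len(rows[0])
    centre = [mean(r[j] for r in rows) for j in range(p)]
    dev = [[r[j] - centre[j] for j in range(p)] for r in rows]
    cov = [[sum(d[j] * d[k] for d in dev) / (n - 1) for k in range(p)]
           for j in range(p)]
    distances = []
    for d in dev:
        z = solve(cov, d)  # z = S^-1 (x - mean)
        distances.append(math.sqrt(sum(di * zi for di, zi in zip(d, z))))
    return distances


def flag_outliers(rows, delta=3.0):
    """Return indices whose distance exceeds mean + delta * std (assumed rule)."""
    dist = mahalanobis_distances(rows)
    threshold = mean(dist) + delta * stdev(dist)
    return [i for i, d in enumerate(dist) if d > threshold], threshold
```

Calling `flag_outliers(features)` once per meat type mirrors the per-species screening in the text; the returned threshold can be compared with the distances plotted for each sample.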

2.3.2. Methods for Model Building

The prediction accuracy strongly depends on the definition of similarity between results of the same samples at different times. To calculate the similarity, a partial least squares (PLS) [35] model was used to build a local linear regression.

Normally, the PLS method is used for quantitative analysis, whereas partial least squares discriminant analysis (PLS-DA), which is derived from PLS, is widely used for classification [36]. Since there is no concentration matrix for class membership, the response values were set artificially in this paper so that the classification could be treated as a quantitative analysis by the partial least squares method.

Support vector machine (SVM) [37] is a machine learning method used for data analysis, pattern recognition, classification, and regression analysis [38]. Because a kernel function is incorporated into the SVM, inner products in the higher dimensional space can be computed efficiently, which solves the problem of nonlinear classification. The selection of a proper kernel function is the key step in building an optimal model. In this report, the radial basis function (RBF) [39] was chosen to map the samples to a higher dimensional space. Compared with the linear kernel function, RBF has broader application and is more suitable for problems that are not linearly separable. Several parameters determine the quality of an RBF model. In this report, the grid search technique [40] was used to choose the RBF parameters. The residual sum of squares was calculated using leave-one-out cross-validation (LOOCV) [41] during model building.
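As a small illustration of the two ingredients named above, the sketch below shows the standard RBF kernel, K(x, z) = exp(−γ‖x − z‖²), and the index bookkeeping for leave-one-out cross-validation. It does not train an SVM; the function names are hypothetical and chosen only for this example.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel value K(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def kernel_matrix(samples, gamma):
    """Symmetric matrix of kernel values between all pairs of samples."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]


def loocv_splits(n_samples):
    """Yield (training indices, held-out index), one split per sample."""
    for held_out in range(n_samples):
        training = [i for i in range(n_samples) if i != held_out]
        yield training, held_out
```

A larger γ makes the kernel decay faster with distance, so each training spectrum influences a smaller neighbourhood; this is why γ has to be tuned rather than fixed in advance.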

2.4. Spectral Data Processing

The original spectral data were processed by multiplicative scatter correction (MSC), which adjusts every observed spectrum toward a reference spectrum, here the mean spectrum [42]. The standard normal variate (SNV) was used to remove the scattering effects caused by particles of different sizes [43]. The Savitzky-Golay filter was applied to smooth the noisy signal without distorting it [44]. Normalization was also applied. The preprocessing was carried out using Unscrambler X 10.3 (CAMO Software, Oslo, Norway), followed by PLS, PLS-DA discriminant analysis, and SVM. The Mahalanobis distance was calculated using MATLAB 2013a (MathWorks, Massachusetts, USA). The quality of the model was evaluated using the determination coefficient, the root-mean-square error (RMSE), and the prediction set accuracy.
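To make the preprocessing steps concrete, here is a minimal sketch of SNV, MSC, and normalization in plain Python. It is not the Unscrambler implementation. Two choices are assumptions made for the example: the MSC reference is the mean of the supplied spectra, and "normalization" is taken to mean scaling each spectrum to unit Euclidean length, since the text does not state which normalization was used.

```python
import math
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own std."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    n_points = len(spectra[0])
    ref = [mean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit of s = offset + slope * ref.
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / sxx
        offset = s_mean - slope * ref_mean
        corrected.append([(v - offset) / slope for v in s])
    return corrected


def unit_vector_normalize(spectrum):
    """Scale a spectrum to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    return [v / norm for v in spectrum]
```

SNV works on one spectrum at a time, whereas MSC needs the whole set to form its reference, so MSC corrections change whenever spectra are added or removed.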

3. Results and Analysis

3.1. The Original Spectral Curves

The original spectral data gathered by scanning pork, beef, and mutton with infrared spectroscopy are shown in Figure 1. To discriminate the three kinds of meat better, the average spectrum of each meat was calculated. As shown in Figure 2, there were strong peaks at the wavenumbers 2925 cm−1, 2855 cm−1, and 1745 cm−1 and several weak peaks between 750 and 1800 cm−1, called the fingerprint region. Figure 2 shows that the absorption spectra of pork, beef, and mutton have no significant differences. Therefore, to get a better sense of the differences between the three types of meat, the second derivative method combined with the Savitzky-Golay smoothing filter (polynomial order 2, window of 10 points) was applied to the average spectra. As shown in Figure 3, after this processing, the peaks of the three types of meat were amplified. The peaks at 2925 cm−1 and 2855 cm−1 were due to lipids (CH2 stretching). The peak at 1745 cm−1 was due to lipids (C=O stretching). The peak at 1645 cm−1 was due to water (O-H stretch) and amide I (C=O stretching). The peaks at 1401 and 1464 cm−1 were due to lipids (C-H stretching) and amide III. Other absorption peaks between 900 and 1200 cm−1 were associated with carbohydrates (C-O or C-C stretching) [45–47]. The intensity differences of the peaks could be taken to reflect the amounts of the corresponding compounds in the different meats. For example, the peak at 2855 cm−1 was stronger in pork than in beef and mutton, which reveals that a certain kind of lipid (CH2 stretching) had different contents in the pork, beef, and mutton samples. Similarly, the peaks at 2925, 1464, and 1173 cm−1 also reveal differences in the content of the corresponding compounds.
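The second derivative step can be illustrated with a short Savitzky-Golay sketch for polynomial order 2. It fits a quadratic by least squares in a sliding window and returns twice the quadratic coefficient. The symmetric window (`half_width` points on each side, so an odd number of points in total) is an assumption of this example, because the text does not say how the software counts its 10-point window, and edge points without a full window are simply dropped.

```python
def sg_second_derivative(y, half_width=5, step=1.0):
    """Second derivative from a local quadratic least-squares fit.

    half_width: points on each side of the centre (assumed symmetric window).
    step: spacing between neighbouring points, e.g. in cm^-1.
    """
    offsets = range(-half_width, half_width + 1)
    n = 2 * half_width + 1
    s2 = sum(i ** 2 for i in offsets)
    s4 = sum(i ** 4 for i in offsets)
    denom = s4 - s2 ** 2 / n
    # Weight of the point at offset i; the factor 2 turns the quadratic
    # coefficient a2 into the second derivative 2 * a2.
    weights = [2 * (i ** 2 - s2 / n) / denom for i in offsets]
    result = []
    for centre in range(half_width, len(y) - half_width):
        window = y[centre - half_width:centre + half_width + 1]
        result.append(sum(w * v for w, v in zip(weights, window)) / step ** 2)
    return result
```

Because differentiation amplifies high-frequency noise, a wider window gives a smoother derivative at the cost of broadening narrow peaks.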

3.2. The Outliers Excluded by Using Mahalanobis Distance

Using (1) and (2), δ = 3 was used in determining the threshold Mahalanobis distance of the original spectral data from the three types of meat. The outliers are presented in Figure 4.

As shown in Figure 4(a), the Mahalanobis distance of beef sample number 53 was 18.55, far beyond the threshold value of 8.74. Therefore, sample number 53 of beef was excluded. For pork (Figure 4(b)), the Mahalanobis distances of samples number 1 and number 53 were 11.44 and 10.14, respectively, both beyond the threshold value of 7.66. Therefore, samples number 1 and number 53 of pork were excluded. For mutton (Figure 4(c)), the Mahalanobis distance of sample number 20 was beyond the threshold value of 9.06. Therefore, sample number 20 of mutton was ruled out. In Table 1, all features in the model calibration sets were scaled to unit variance; RMSEC is the root-mean-square error of calibration, RMSECV is the root-mean-square error of cross-validation, R2cal is the determination coefficient of the calibration sets, and R2cv is the determination coefficient of the cross-validation sets. After the outliers were removed, the determination coefficient increased and the RMSEC and RMSECV decreased, indicating that the outliers had a great impact on model building (Table 1).

3.3. The Preprocessing of the Original Spectral Data

Undesired variations often constitute the major part of the total variation in a sample set and can be observed as baseline shifts or as noise such as scatter distortion. The first and second derivatives, MSC, SNV, the Savitzky-Golay filter, and normalization were used to protect the samples from disturbances caused by the instrument, the environment, and sample preparation. The results are shown in Table 2.

The first and second derivative preprocessing not only enhanced the characteristic absorption features but also amplified the noise, which decreased the accuracy of the resulting model. Normalization proved to be the best preprocessing method because it could avoid the effects of sample thickness and transmissivity and could therefore efficiently eliminate differences between measurements of the same sample. With normalization, the determination coefficients of the calibration and test sets were more than 0.99, and the RMSEC and RMSECV were 0.06 and 0.08, respectively. The model built on the normalized data achieved 100% accuracy when it was used to test the 20 samples of each meat.

3.4. The Building Models of PLS-DA and SVM

Before building the model, we carried out a principal component analysis of the three meats. The beef, pork, and mutton could be effectively distinguished by the first three principal components. They contributed 44%, 35%, and 14% of the variance, respectively, which together contribute a total of 90% of the variance (Figure 5).

PLS-DA and SVM methods were used to build the discrimination models for the pork, beef, and mutton samples. During model building, the samples of each meat were separated into a calibration set and a test set. After outlier removal, 38, 39, and 39 samples of pork, beef, and mutton, respectively, were randomly chosen as calibration sets, and the remaining 20 samples of each meat were grouped into test sets.
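A random split of this kind can be sketched as follows. The function name and the fixed seed are hypothetical choices made for reproducibility of the example; the study does not describe how the random selection was implemented.

```python
import random


def split_calibration_test(sample_ids, n_test=20, seed=0):
    """Randomly hold out n_test samples; the rest form the calibration set."""
    rng = random.Random(seed)  # fixed seed so the split can be repeated
    test = set(rng.sample(sample_ids, n_test))
    calibration = [s for s in sample_ids if s not in test]
    return calibration, sorted(test)
```

Applied separately to each meat type, this keeps 20 test samples per species regardless of how many outliers were removed from that species beforehand.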

In the PLS-DA method, the reference values of beef, pork, and mutton were set as −1, 0, and 1, respectively. The values predicted by the PLS-DA model for the calibration sets and cross-validation sets are compared against the reference values in Figure 6. The predicted values of beef, pork, and mutton were distributed around the reference values of −1, 0, and 1, respectively. The RMSEC and RMSECV were 0.06 and 0.08; the determination coefficient was 0.99. The result of using the model to predict the 20 samples of each meat in the test sets is shown in Figure 7.
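The error measures quoted here can be computed from reference and predicted values as in the sketch below. The same RMSE formula gives RMSEC, RMSECV, or RMSEP depending on whether the predictions come from the calibration fit, cross-validation, or the test set. The determination coefficient is written as 1 − SS_res/SS_tot, which is an assumption; some software reports the squared correlation instead.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Determination coefficient, 1 - SS_res / SS_tot (assumed definition)."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot
```

For example, predictions that are uniformly 0.1 above the references −1, 0, and 1 give an RMSE of 0.1 and a determination coefficient of 0.985.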

The accuracy of the predictions of the model was evaluated using a range of ± 0.5 around each reference value. The results showed that the predicted values of beef were between −1.09 and −0.92, those of pork were between −0.05 and 0.18, and those of mutton were between 0.70 and 1.08. All predicted values fell within the ranges of the reference values, and the accuracy rate was 100%.
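The ± 0.5 decision rule can be written as a small lookup. The handling of values that fall exactly on a boundary or outside every range is not specified in the text; returning `None` for those cases is an assumption of this sketch.

```python
REFERENCE_VALUES = {"beef": -1.0, "pork": 0.0, "mutton": 1.0}


def classify(predicted, tolerance=0.5):
    """Return the meat whose reference value lies within the tolerance."""
    for label, reference in REFERENCE_VALUES.items():
        if abs(predicted - reference) < tolerance:
            return label
    return None  # outside every range, or exactly on a boundary


def accuracy(predicted_values, true_labels):
    """Fraction of predictions assigned to the correct meat."""
    hits = sum(classify(p) == t for p, t in zip(predicted_values, true_labels))
    return hits / len(true_labels)
```

With the extreme predicted values reported above, `classify(-1.09)` gives `"beef"`, `classify(0.18)` gives `"pork"`, and `classify(0.70)` gives `"mutton"`.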

When the model was built using the SVM method, the nu-SVM formulation was used [48], with the radial basis function (RBF) as the kernel function. Grid search and cross-validation were combined to determine the kernel function parameter γ and the parameter nu, which in nu-SVM takes the place of the error penalty coefficient and bounds the fraction of margin errors. The search range of the grid search was set as γ: 10^−2 to 10^2 and nu: 0.01–1. After optimization, the values of γ and nu were 10 and 0.23, respectively. Figure 8 shows the result of discriminating the three meats using the model built by the SVM method.
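The grid search over the two parameters can be sketched as below. The search ranges come from the text, but the grid spacing (one γ value per decade, nu in steps of 0.01) is an assumption, and `cv_error` is a hypothetical callable standing in for whatever trains the nu-SVM and returns its cross-validation error for a given pair of parameters.

```python
from itertools import product


def log_grid(low_exp, high_exp, steps_per_decade=1):
    """Values 10**low_exp ... 10**high_exp, evenly spaced on a log scale."""
    count = (high_exp - low_exp) * steps_per_decade
    return [10 ** (low_exp + k / steps_per_decade) for k in range(count + 1)]


def linear_grid(start, stop, step):
    """Values start, start + step, ..., stop, rounded to avoid float drift."""
    count = round((stop - start) / step)
    return [round(start + k * step, 10) for k in range(count + 1)]


def grid_search(gammas, nus, cv_error):
    """Return the (gamma, nu) pair with the lowest cross-validation error."""
    return min(product(gammas, nus), key=lambda pair: cv_error(*pair))


gammas = log_grid(-2, 2)           # 0.01, 0.1, 1, 10, 100 (assumed spacing)
nus = linear_grid(0.01, 1.0, 0.01)  # 0.01, 0.02, ..., 1.00 (assumed spacing)
```

Every grid point requires a full cross-validation run, so the cost grows with the product of the two grid sizes; a coarse grid followed by a finer one around the best point is a common way to limit it.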

The features of the two models built by the PLS-DA and SVM methods, respectively, are shown in Table 3.

The prediction accuracy on the test sets reached 100% for both the PLS-DA and SVM methods; the determination coefficients of the two models were at least 0.96, and the root-mean-square errors were below 0.25, indicating that both methods were good enough for the discrimination of the three types of meat. However, the determination coefficients of the calibration sets and test sets in the model built by PLS-DA all reached 0.99, higher than those obtained by SVM. In addition, the root-mean-square errors of the calibration sets, test sets, and cross-validation were all no higher than 0.08, lower than those from SVM, indicating that the model built by the PLS-DA method was the better model for the discrimination of pork, beef, and mutton.

4. Discussion

Accurate species identification of meat and meat products is important for enforcing regulations on livestock products, maintaining livestock product standards, preventing unfair competition in the meat industry, respecting religious and social customs, controlling the poaching of wild animals, and so on. FT-IR spectroscopy is well suited to the analysis of adulteration in meat samples. All the pork, beef, and mutton samples obtained from the market could be readily identified based solely on their different structures and chemical compositions.

As shown in Figure 5, the pork, beef, and mutton could be identified simply on the basis of the first three major components. However, when the PLS-DA model was built on these three major components, the values of RMSECV and RMSEP were high. Therefore, 7 major components were included in model building, so as to maintain robustness and avoid overfitting [49]. In this report, both models (PLS-DA and SVM) gave good predictions. However, the PLS-DA model showed less prediction error than the SVM model. This might be because the SVM model is better suited to nonlinear spectral data with less variation [50, 51]. The spectral data here contained many linear variations, which caused the higher prediction error in the SVM model, consistent with a previous report [52].

Pork, beef, and mutton are meats with different structures and chemical components. Infrared spectroscopy can be used to quickly discriminate the three types of meat because their different chemical components absorb at different frequencies. In an earlier report, pork, beef, and mutton were identified by infrared spectroscopy with accuracies of 99.28%, 97.42%, and 100%, respectively, using a model built with 10 major components [10]. In this report, however, the accuracy of identification of the three types of meat was 100% with a model built on only 7 major components, indicating that our model showed higher accuracy and better robustness.

Rohman et al. [45] used FT-IR spectroscopy for the detection and quantification of pork adulteration in beef meatballs for Halal verification, based on specific functional groups of the fat. In our experiment, we used dried ground meat samples to avoid distortion from intramuscular fat (IMF) deposits within the meat samples and to make the samples more homogeneous when prepared with potassium bromide, as observed in Figure 1. It is sometimes difficult to detect the differences between the spectra of different kinds of meat by the naked eye because the peaks are so similar. To overcome this issue, we employed second derivative spectral analysis for clear differentiation of the spectra, which allowed us to observe easily the variation in structure and chemical components between the meat samples. The fingerprint region obtained from the second derivative spectra of different meat samples could be used to distinguish small differences in the structure of different compounds such as water, fat, proteins, and carbohydrates [47]. Compounds with different compositions or contents in different kinds of meat contributed the main spectral characteristics in the fingerprint region. Statistical tools such as the Mahalanobis distance method and models such as PLS-DA and SVM were then used to exclude the outliers and to discriminate the meat samples [34, 35, 37, 38].

Since this paper mainly focused on the efficiency of the algorithm and model, some more rigorous experiments, such as different ratios of adulterated meat or processed meat, were not covered in this work. By comparing the relevant papers, however, we could conclude that the greater the difference between the meat breeds, the higher the accuracy that can be reached when distinguishing them. For example, when meats from the same animal species were measured by infrared spectroscopy, such as white and Iberian pork [53] or Norwegian salmon and Chilean Pacific salmon [54], the accuracy decreased. This is very likely because meats from the same animal species share similar chemical components. Another similar conclusion was that when the mixing ratios of the meats were close, the distinguishing accuracy was lower than when the mixing ratios differed widely [46, 55]. In any case, infrared spectroscopy was again shown to be an effective method for fighting the adulteration of meats.

5. Conclusion

In conclusion, infrared spectroscopy was used to detect pork adulterated into beef or mutton in market samples for routine analysis. The Mahalanobis distance method was successfully used to exclude the outliers, and the interferences were removed by preprocessing procedures. Meat samples were discriminated using PLS-DA and SVM models, of which PLS-DA performed best. For the model built by the PLS-DA method, R2cal and R2cv reached 0.99, the RMSEC was 0.06, and the RMSECV and RMSEP were both 0.08; the accuracy of the model prediction was 100%. Overall, the model built by the PLS-DA method was preferable to that built by the SVM method. Infrared spectroscopy combined with PLS-DA could accurately discriminate pork, beef, and mutton and has promising market value.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was jointly supported by National Natural Science Fund (61501531), Fund for Science and Technology from Guangdong Province (2015A020209173), Guangzhou Industry University Research Cooperative Innovation Major Project (201704020030), and “Innovation and Strong Universities” special funds (KA170500G) from the Department of Education of Guangdong Province.