#### Abstract

Terahertz time-domain spectroscopy (THz-TDS) combined with chemometrics is used to detect the adulteration of Panax notoginseng powder with similar-looking substances. Four kinds of samples are prepared: pure Panax notoginseng powder and three kinds of adulterated samples, namely Panax notoginseng powder adulterated with zedoary turmeric powder, with wheat flour, and with rice flour. The adulteration concentration ranges from 5% to 60% in steps of 5%. The samples are divided by class into modeling and prediction sets at a ratio of 3 : 1. Feature information is extracted by uninformative variable elimination (UVE) and the successive projections algorithm (SPA); combined with a back propagation neural network (BPNN), the UVE-BPNN and SPA-BPNN qualitative models are established. The results show that the UVE-BPNN model performs better, with a prediction-set classification accuracy of 95%. Then, the least squares support vector machine (LS-SVM) and partial least squares (PLS) algorithms are used to establish quantitative analysis models. For zedoary turmeric powder and wheat flour, the LS-SVM model performs better, with correlation coefficients of prediction (R_{P}) of 0.90 and 0.93 and root mean square errors of prediction (RMSEP) of 0.072 and 0.068, respectively. For rice flour, the PLS model performs better, with an R_{P} of 0.94 and an RMSEP of 0.06. The results show that THz-TDS combined with chemometrics can determine the adulteration of Panax notoginseng powder with similar substances quickly, accurately, and nondestructively.

#### 1. Introduction

Panax notoginseng, mainly produced in southwest China, is an essential medicinal material. Modern pharmacological research has shown that Panax notoginseng, which is well known worldwide, has hemostatic, antihypertensive, antithrombotic, and neuroprotective effects [1, 2]. Some merchants mix cheap powders of similar color into Panax notoginseng powder to pass off inferior products as genuine; wheat flour, rice flour, and zedoary turmeric powder, which are similar to Panax notoginseng in appearance and physical properties, are the most common adulterants. It is almost impossible for consumers to judge the purity of the powder by eye.

At present, the primary methods for quality detection of Panax notoginseng are high-performance liquid chromatography, spectrometric methods, character identification, and so on. Yang et al. [3] used high-performance liquid chromatography to analyze the chemical composition and active component content of 215 samples of Panax notoginseng of different specifications, plant parts, and geographical origins. Li et al. [4] used near-infrared spectroscopy to detect the polysaccharide content in Panax notoginseng. Li et al. [5] used fluorescence spectroscopy to determine qualitatively whether counterfeit products had been added to Panax notoginseng powder. Shen et al. [2] used laser-induced breakdown spectroscopy (LIBS) to detect six nutrient elements in Panax notoginseng samples from 8 producing areas with high precision and established PLS and LS-SVM quantitative analysis models. Meng et al. [6] photographed Panax notoginseng and its counterfeits under an electron microscope and studied the identification of their microscopic traits. Although high-performance liquid chromatography is accurate and can analyze the chemical components of Panax notoginseng powder, it is complicated and difficult to operate. The equipment for microscopic identification is expensive, and testing is slow. In addition, near-infrared spectroscopy has poor anti-interference ability and low sensitivity. The fluorescence spectroscopy method does not analyze the adulteration of Panax notoginseng quantitatively, and the LIBS method has only been used to explore the production area of Panax notoginseng. Therefore, THz-TDS is proposed to determine whether Panax notoginseng powder is adulterated and, if so, to what extent. Compared with other detection technologies, THz-TDS is widely used in food safety detection because it is nondestructive, easy to operate, precise, and fast [7–9].

No accurate and rapid detection technology is currently available for Panax notoginseng adulterated with the various kinds of starch found on the market. In this work, THz-TDS is used to qualitatively and quantitatively detect rice flour, wheat flour, and zedoary turmeric powder mixed at different concentrations into Panax notoginseng powder. Combined with chemometric methods, optimal qualitative and quantitative models for the adulteration of Panax notoginseng powder are established, which provide a theoretical basis and an experimental reference for detecting the adulteration of Panax notoginseng on the market.

#### 2. Materials and Methods

##### 2.1. Sample Preparation

In this experiment, Panax notoginseng is bought from WENSHAN KANG MILLION AGRICULTURAL DEVELOPMENT CO., LTD, and four types of samples are prepared: Class I is pure Panax notoginseng powder, Class II is Panax notoginseng powder adulterated with zedoary turmeric powder, Class III is Panax notoginseng powder adulterated with wheat flour, and Class IV is Panax notoginseng powder adulterated with rice flour. A total of 360 adulterated samples are prepared, with concentrations ranging from 5% to 60% in steps of 5%. The concentrations and the number of samples are detailed in Table 1. All adulterated samples are mixed evenly and dried in a dryer to remove moisture. Then, a hydraulic press is used to press the powders under a pressure of 10 MPa for 1 minute to prepare tablets. The tablets are round, with a diameter of about 13 mm and a thickness of about 0.8–1.1 mm. The spectral data of the samples are collected with a TAS7400TS THz-TDS instrument developed by Advantest Corporation of Japan.

##### 2.2. Variable Selection Method

SPA is a forward variable selection method whose goal is to minimize the collinearity among the selected variables [10]. It obtains a subset of variables with minimum collinearity through simple projection operations in a vector space. First, the maximum number of variables to select is set, and a starting vector is chosen in the M-dimensional space (M is the number of original variables). Then, the remaining vectors are projected onto the subspace orthogonal to the starting vector, and the vector with the largest projection is chosen as the new starting vector. The procedure is repeated until the required number of variables has been selected.
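The projection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation used in this work: the function name `spa_select`, the column-wise data layout, and the toy data are hypothetical, and the starting variable is passed in by the caller instead of being optimized.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def spa_select(columns, start, n_select):
    """Select n_select variable indices with low collinearity.

    columns  : list of M variable vectors (one list of sample values each)
    start    : index of the starting variable (chosen by the caller)
    n_select : number of variables to keep (at most M)
    """
    cols = [list(c) for c in columns]  # working copies that get projected
    selected = [start]
    while len(selected) < n_select:
        ref = cols[selected[-1]]
        ref_norm2 = dot(ref, ref)
        best_j, best_norm = None, -1.0
        for j in range(len(cols)):
            if j in selected:
                continue
            # Project column j onto the subspace orthogonal to the reference.
            coef = dot(cols[j], ref) / ref_norm2
            cols[j] = [a - coef * r for a, r in zip(cols[j], ref)]
            norm = math.sqrt(dot(cols[j], cols[j]))
            if norm > best_norm:
                best_j, best_norm = j, norm
        # The variable with the largest projection becomes the new reference.
        selected.append(best_j)
    return selected


# Toy data: variable 1 is collinear with variable 0, variable 2 is orthogonal.
toy = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
print(spa_select(toy, start=0, n_select=2))
```

In the toy data, the variable that merely rescales the starting variable has zero projection and is skipped, which is the behavior the collinearity criterion is meant to produce.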

UVE is a method based on the stability analysis of the regression coefficients of the PLS model [11]. It is designed to eliminate variables that carry no useful information in the original spectral data. When the UVE algorithm runs, a matrix of random variables with the same dimensions as the spectral matrix is generated artificially as a reference. A stability value and a threshold are used to evaluate the reliability of each variable; variables whose absolute stability value is below the threshold are deleted [8]. The stability value S is defined as follows:

$$S_i = \frac{\operatorname{mean}(b_i)}{\operatorname{std}(b_i)}, \quad i = 1, 2, \ldots, m,$$

where $S_i$ is the stability value of the *i*-th variable of the model and $b_i$ is the regression coefficient of the *i*-th variable in the model. $\operatorname{mean}(b_i)$ and $\operatorname{std}(b_i)$ are the mean and standard deviation of $b_i$, respectively, and *m* is the number of input variables.
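As an illustration of the stability criterion, the following Python sketch computes the mean-to-standard-deviation ratio of each regression coefficient and removes the spectral variables that are no more stable than the random reference variables. It rests on several assumptions that are not taken from this work: the coefficients are assumed to come from repeated (for example, leave-one-out) PLS fits supplied by the caller, the threshold is taken as the largest absolute stability among the noise variables, and the name `uve_select` and the toy numbers are hypothetical.

```python
import statistics


def uve_select(coef_runs, n_real):
    """Return (stability, threshold, kept indices).

    coef_runs : list of coefficient vectors, one per repeated PLS fit; each
                vector holds n_real spectral coefficients followed by the
                coefficients of the random noise variables
    n_real    : number of real spectral variables
    """
    n_total = len(coef_runs[0])
    stability = []
    for i in range(n_total):
        b_i = [run[i] for run in coef_runs]
        # Stability = mean / standard deviation of the i-th coefficient.
        stability.append(statistics.mean(b_i) / statistics.stdev(b_i))
    # Assumed threshold: largest |stability| reached by a noise variable.
    threshold = max(abs(s) for s in stability[n_real:])
    kept = [i for i in range(n_real) if abs(stability[i]) > threshold]
    return stability, threshold, kept


# Toy coefficients from three fits: two spectral variables, two noise variables.
runs = [
    [1.0, 0.1, 0.3, -0.1],
    [1.1, -0.1, -0.1, 0.1],
    [0.9, 0.0, 0.1, 0.0],
]
stability, threshold, kept = uve_select(runs, n_real=2)
print(kept)
```

A coefficient that stays large and nearly constant across fits obtains a high stability value and is retained, whereas a coefficient that fluctuates around zero is treated as uninformative.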

##### 2.3. Modeling Method

The BPNN is a nonlinear multilayer feed-forward neural network consisting of an input layer, a hidden layer, and an output layer [12]. Figure 1 shows the topology of the BPNN; each circle in Figure 1 represents a neuron. Spectral data enter through the input layer, where they are standardized, and are passed to the hidden layer through weighted connections. The predicted value of the BPNN is obtained at the output layer and compared with the expected value. If the error is larger than the target error, it is propagated backward, and the thresholds and weights are adjusted until the error is less than or equal to the target error.
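The training loop described above (forward pass, comparison with the expected value, backward propagation of the error, and weight adjustment until the error target is met) can be sketched as follows. This is a minimal single-hidden-layer example under stated assumptions, not the network used in this work: the sigmoid activation, the learning rate, the target RMSE, the random initialization, the toy data, and the name `train_bpnn` are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_h, b_h, w_o, b_o):
    """Forward pass: input layer -> hidden layer -> single output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_h, b_h)]
    output = sigmoid(sum(w * h for w, h in zip(w_o, hidden)) + b_o)
    return hidden, output


def rmse(data, w_h, b_h, w_o, b_o):
    sq = [(forward(x, w_h, b_h, w_o, b_o)[1] - t) ** 2 for x, t in data]
    return math.sqrt(sum(sq) / len(sq))


def train_bpnn(data, n_hidden=3, lr=0.5, target_rmse=0.05,
               max_epochs=5000, seed=0):
    """Train until the RMSE reaches target_rmse or max_epochs is exhausted."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
           for _ in range(n_hidden)]
    b_h = [0.0] * n_hidden
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_o = 0.0
    initial = rmse(data, w_h, b_h, w_o, b_o)
    final, epochs = initial, 0
    while final > target_rmse and epochs < max_epochs:
        for x, t in data:
            hidden, out = forward(x, w_h, b_h, w_o, b_o)
            # Output error term (squared error, sigmoid derivative).
            delta_o = (out - t) * out * (1.0 - out)
            for j, h in enumerate(hidden):
                # Propagate the error back to hidden neuron j before updating.
                delta_h = delta_o * w_o[j] * h * (1.0 - h)
                w_o[j] -= lr * delta_o * h
                for i, xi in enumerate(x):
                    w_h[j][i] -= lr * delta_h * xi
                b_h[j] -= lr * delta_h
            b_o -= lr * delta_o
        epochs += 1
        final = rmse(data, w_h, b_h, w_o, b_o)
    return initial, final, epochs


# Toy two-feature data in which the target equals the first feature.
toy = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0),
       ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
initial, final, epochs = train_bpnn(toy)
print(initial, final, epochs)
```

The stopping rule mirrors the description in the text: training ends once the error falls to the target or, as a safeguard added here, when a maximum number of epochs is reached.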

The LS-SVM algorithm, an improved version of the SVM algorithm, can handle both linear and nonlinear multivariate calibration problems. LS-SVM has three main types of kernel functions: the linear kernel function, the polynomial kernel function, and the radial basis kernel function. Compared with the linear and polynomial kernel functions, the radial basis kernel function can better reduce the computational complexity of the training process, and it can also handle the nonlinear relationship between the spectral data and the reference values.
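For reference, the linear and radial basis kernel functions mentioned above can be written as short Python functions. This is only a sketch: the width parameter `sigma2` and the convention exp(-||x - z||² / (2σ²)) are assumptions, since this work does not state which parameterization of the radial basis kernel is used, and other toolboxes use slightly different scalings.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the inner product of the two spectra."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, sigma2=1.0):
    """Radial basis kernel with an assumed width parameter sigma2."""
    dist2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-dist2 / (2.0 * sigma2))


print(linear_kernel([1.0, 2.0], [3.0, 4.0]))
print(rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma2=25.0))
```

The radial basis kernel equals 1 for identical spectra and decays toward 0 as the spectra move apart, which is what allows it to model nonlinear relationships.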

PLS is a multivariate calibration algorithm built on the ideas of principal component analysis and multiple regression [13]. The performance of PLS is evaluated in terms of accuracy and linearity; its accuracy can be assessed by the root mean square error of prediction (RMSEP) [14]. The PLS decomposition is as follows:

$$X = TP^{\mathsf{T}} + E_X, \qquad Y = UQ^{\mathsf{T}} + E_Y,$$

where *T* and *U* are the score matrices of the spectral matrix *X* and the concentration matrix *Y*, respectively, *P* and *Q* are the loading matrices of the spectral matrix *X* and the concentration matrix *Y*, and $E_X$ and $E_Y$ are the fitting residuals of the spectral matrix *X* and the concentration matrix *Y*.

Then, a linear regression between *T* and *U* is performed:

$$U = TB,$$

where *B* is the regression coefficient matrix.

Finally, the predicted value of concentration is obtained by the following formula:

$$\hat{Y} = T_{\text{new}} B Q^{\mathsf{T}},$$

where $T_{\text{new}}$ is the score matrix of the spectra to be predicted.

Partial least squares discriminant analysis (PLS-DA) is a linear discriminant analysis algorithm based on PLS regression. It is a supervised method for classification that captures the maximum difference between predefined sample groups.

#### 3. Qualitative Analysis

Figure 2 shows the absorption coefficients of the four sample types in 0.5–2 THz, with the adulterated samples at a concentration of 20%. The absorption coefficient curves of the four types show no peaks or valleys, but their absorption strengths differ. This shows that mixing 20% of an adulterant into Panax notoginseng changes the absorption strength. In this paper, the absorption coefficient in 0.5–2 THz is selected for modeling.

##### 3.1. Feature Information Extraction

When the UVE method is used to select spectral variables, a group of random noise variables is introduced, equal in number to the spectral variables. The result of UVE is shown in Figure 3, with the spectral variables on the left and the computer-generated random noise on the right. The ordinate is the stability value; the larger its absolute value, the greater the influence of the variable on the model. The abscissa is the index of the spectral variables and random noise variables. The two dashed lines are the thresholds determined by UVE: the variables outside the threshold lines are retained, and the variables between the two threshold lines are eliminated. After selection, 80 variables are obtained and then input into the BPNN to establish a model.

SPA is applied to the spectral variables in 0.5–2.0 THz, with the number of selected variables set from 10 to 100. As shown in Figure 4(a), once the number of variables exceeds 53, the RMSE remains small and almost constant, so the 53 variables selected by SPA are appropriate. The distribution of the variables selected by SPA is shown in Figure 4(b).

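**Figure 4:** (a) RMSE versus the number of variables selected by SPA; (b) distribution of the variables selected by SPA.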

##### 3.2. BPNN Model

The feature information selected by UVE is input into the BPNN. After several adjustments, the optimal number of hidden layer nodes in the BPNN model is 5. The classification threshold is 0.5, and the four categories are pure Panax notoginseng powder and Panax notoginseng powder adulterated with zedoary turmeric powder, with wheat flour, and with rice flour. The adulterated samples have concentrations ranging from 5% to 60%. As shown in Figure 5(a), the error reaches a local minimum at the seventh generation of training, where the RMSE is 0.2323; the model achieves its best performance at this point. The prediction set results of the BPNN model are shown in Figure 5(b). All pure Panax notoginseng samples are classified correctly, while a small number of adulterated samples are misclassified. Among the samples adulterated with zedoary turmeric powder, 1 sample is misjudged as adulterated with wheat flour and 1 sample as adulterated with rice flour. Among the samples adulterated with wheat flour, 3 samples are misjudged as adulterated with zedoary turmeric powder. Among the samples adulterated with rice flour, 1 sample is misjudged as adulterated with wheat flour. The overall prediction set accuracy of the model is 95%. The results show that THz-TDS combined with UVE-BPNN can identify similar substances mixed into Panax notoginseng powder.

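**Figure 5:** (a) training error of the UVE-BPNN model versus generation; (b) prediction set results of the UVE-BPNN model.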

The variables selected by SPA are input into the BPNN. As shown in Figure 6(a), the error reaches a local minimum at the tenth generation of training, where the RMSE is 0.1218. The prediction set results of the BPNN model are shown in Figure 6(b), from which it can be seen that a small number of samples are misclassified. Among the samples adulterated with zedoary turmeric powder, 2 samples are misjudged as adulterated with rice flour. Among the samples adulterated with wheat flour, 1 sample is misjudged as adulterated with rice flour and 1 sample as adulterated with zedoary turmeric powder. Among the samples adulterated with rice flour, 1 sample is misjudged as adulterated with wheat flour and 2 samples as adulterated with zedoary turmeric powder. Among the pure Panax notoginseng samples, 2 samples are misjudged as adulterated with wheat flour. The overall prediction set accuracy of the model is 92.5%.
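The reported accuracies can be cross-checked from the misclassification counts given in the text. The sketch below rebuilds the two confusion matrices under one assumption that is not stated explicitly in this work: the prediction set is taken to contain 30 samples per class (120 in total), which is the size implied by 6 errors at 95% accuracy and 9 errors at 92.5% accuracy. The function name `accuracy` and the matrix layout are illustrative.

```python
def accuracy(confusion):
    """Overall accuracy of a square confusion matrix (rows: true class)."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Class order: I pure, II zedoary turmeric, III wheat flour, IV rice flour.
# Assumed 30 prediction samples per class; off-diagonal counts from the text.
uve_bpnn = [
    [30, 0, 0, 0],
    [0, 28, 1, 1],   # 1 judged as wheat flour, 1 as rice flour
    [0, 3, 27, 0],   # 3 judged as zedoary turmeric
    [0, 0, 1, 29],   # 1 judged as wheat flour
]
spa_bpnn = [
    [28, 0, 2, 0],   # 2 judged as wheat flour
    [0, 28, 0, 2],   # 2 judged as rice flour
    [0, 1, 28, 1],   # 1 judged as zedoary turmeric, 1 as rice flour
    [0, 2, 1, 27],   # 2 judged as zedoary turmeric, 1 as wheat flour
]

print(accuracy(uve_bpnn), accuracy(spa_bpnn))
```

Under this assumption, the counts reproduce the stated accuracies of 95% for UVE-BPNN and 92.5% for SPA-BPNN.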

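**Figure 6:** (a) training error of the SPA-BPNN model versus generation; (b) prediction set results of the SPA-BPNN model.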

#### 4. Quantitative Analysis

##### 4.1. Spectral Analysis

Figure 7 shows the average absorption coefficients at different adulteration mass fractions as a function of frequency in 0.5–2.0 THz. Figure 7(a) shows the average absorption coefficient of Panax notoginseng powder adulterated with zedoary turmeric powder at concentrations from 10% to 60%; the absorption coefficient decreases as the adulteration concentration increases. Figure 7(b) shows the average absorption coefficient of Panax notoginseng powder adulterated with rice flour at concentrations from 10% to 60%; again, the absorption coefficient decreases as the adulteration concentration increases. Figure 7(c) shows the average absorption coefficient of Panax notoginseng powder adulterated with wheat flour; similarly, the absorption coefficient decreases as the adulteration concentration increases. In the frequency range of 0.5–2.0 THz, the absorption coefficients of Panax notoginseng powder mixed with each of the three adulterants increase with frequency.

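**Figure 7:** average absorption coefficients of Panax notoginseng powder adulterated with (a) zedoary turmeric powder, (b) rice flour, and (c) wheat flour in 0.5–2.0 THz.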

##### 4.2. LS-SVM Model

In this paper, the LS-SVM algorithm is used to quantify the concentrations of the three types of adulterants. The adulteration concentrations of the samples range from 5% to 60%. For each adulteration type, the samples are randomly divided into a modeling set and a prediction set at a ratio of 3 : 1, and quantitative analysis models of the adulteration of Panax notoginseng powder are established with the radial basis function (RBF) kernel and the linear (LIN) kernel. Figure 8(a) shows the scatter plot of the LS-SVM prediction model for Panax notoginseng powder adulterated with zedoary turmeric powder. The accuracy of the two kernel functions is very close: with the RBF kernel, Rp is 0.9015 and RMSEP is 0.0723; with the linear kernel, Rp is 0.9012 and RMSEP is 0.0724. Figures 8(b) and 8(c) show the scatter plots of the LS-SVM prediction models for Panax notoginseng powder adulterated with wheat flour and with rice flour, respectively. In Figure 8(b), the RBF kernel gives the higher prediction accuracy, with an Rp of 0.9306 and an RMSEP of 0.0677. In Figure 8(c), the RBF kernel gives a slightly better result, with an Rp of 0.9363 and an RMSEP of 0.0619.
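The random 3 : 1 division by adulteration type described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name `stratified_split`, the fixed random seed, and the assumption that the 360 adulterated samples are spread evenly over the three adulterants (120 each) are hypothetical.

```python
import random


def stratified_split(labels, ratio=0.75, seed=0):
    """Split sample indices into modeling and prediction sets per class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    modeling, prediction = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)                 # random assignment within the class
        cut = round(len(idxs) * ratio)    # 3 : 1 corresponds to ratio = 0.75
        modeling.extend(idxs[:cut])
        prediction.extend(idxs[cut:])
    return sorted(modeling), sorted(prediction)


# Assumed layout: 120 samples for each of the three adulterants.
labels = ["zedoary"] * 120 + ["wheat"] * 120 + ["rice"] * 120
modeling, prediction = stratified_split(labels)
print(len(modeling), len(prediction))
```

Splitting within each class keeps the 3 : 1 proportion for every adulterant, so no class is under-represented in the prediction set.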

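**Figure 8:** scatter plots of the LS-SVM prediction models for Panax notoginseng powder adulterated with (a) zedoary turmeric powder, (b) wheat flour, and (c) rice flour.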

The prediction results of the LS-SVM quantitative models are shown in Table 2. The table shows that, for Panax notoginseng powder adulterated with zedoary turmeric powder, the result is better with the LIN kernel function. For Panax notoginseng powder adulterated with wheat flour and with rice flour, the best results are obtained with the RBF kernel function.

##### 4.3. PLS Model

In establishing the PLS quantitative model, it is crucial to select an appropriate number of principal factors. If the number of principal factors is too large, more useless information is retained, and the accuracy of the model suffers; if the number of principal factors is too small, some critical spectral information is ignored, and the accuracy of the model also suffers [15, 16]. The root mean square error of calibration (RMSEC), the RMSEP, the correlation coefficient of calibration (R_{C}), and Rp are used to assess the model. The absorption coefficient spectra of the samples are input into PLS to establish the quantitative analysis model. As shown in Figure 9(a), the optimal number of principal factors is 3; as shown in Figure 9(b), the results of the PLS model for Panax notoginseng powder adulterated with zedoary turmeric powder are poor, with an Rp of 0.6328 and an RMSEP of 0.13241. As shown in Figure 9(c), the optimal number of principal factors is 8; as shown in Figure 9(d), the results of the PLS model for Panax notoginseng powder adulterated with rice flour are good, with an Rp of 0.9424 and an RMSEP of 0.0601. As shown in Figure 9(e), the optimal number of principal factors is 9; as shown in Figure 9(f), the results of the PLS model for Panax notoginseng powder adulterated with wheat flour are good, with an Rp of 0.9047 and an RMSEP of 0.0771.

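**Figure 9:** selection of the number of principal factors (a, c, e) and PLS model results (b, d, f) for Panax notoginseng powder adulterated with zedoary turmeric powder (a, b), rice flour (c, d), and wheat flour (e, f).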

Table 3 shows the results of the PLS quantitative analysis models for the three adulteration types. The results show that PLS is feasible for the quantitative identification of Panax notoginseng powder adulterated with rice flour and with wheat flour: the correlation coefficients of the modeling set and the prediction set of these PLS models are above 0.9. However, the results for Panax notoginseng powder adulterated with zedoary turmeric powder are poor: the correlation coefficients of the modeling set and the prediction set are low, and the RMSE values are large.

#### 5. Model Evaluation

In the qualitative analysis, the classification accuracy of the prediction set is used to evaluate the models. BPNN models combined with UVE and with SPA are established, and both achieve high classification accuracy; the UVE-BPNN model, with a classification accuracy of 0.95, is the better one. In the quantitative analysis, the models are evaluated by R and RMSE: the higher the R and the smaller the RMSE, the more accurate the model, and the closer the RMSEC is to the RMSEP, the more stable the model. The quantitative analysis models of the adulteration of Panax notoginseng powder are established by LS-SVM and PLS. For Panax notoginseng powder adulterated with zedoary turmeric powder, the best result is obtained by LS-SVM with the LIN kernel, with an Rp of 0.9015 and an RMSEP of 0.0723. For Panax notoginseng powder adulterated with wheat flour, the best result is obtained by LS-SVM with the RBF kernel, with an Rp of 0.9315 and an RMSEP of 0.0677. For Panax notoginseng powder adulterated with rice flour, the best result is obtained by PLS, with an Rp of 0.9424 and an RMSEP of 0.0601.
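The two evaluation indices used above can be computed as in the following Python sketch. The correlation coefficient is taken here to be the Pearson correlation between reference and predicted concentrations, which is the usual definition but is an assumption, since this work does not spell out the formula; the function names and the toy concentrations are illustrative only and are not measured data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and prediction."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Made-up adulteration fractions, for illustration only.
reference = [0.10, 0.20, 0.30, 0.40]
predicted = [0.15, 0.15, 0.35, 0.35]
print(rmse(reference, predicted), pearson_r(reference, predicted))
```

Applied to the modeling set, these functions give RMSEC and R_{C}; applied to the prediction set, they give RMSEP and Rp.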

#### 6. Summary

The qualitative and quantitative analysis of the adulteration of Panax notoginseng powder with similar substances is conducted based on THz-TDS. Comparing the spectra of samples with the same adulterant at different concentrations, and the spectra of samples with different adulterants, shows that the THz spectra differ significantly both across concentrations of the same adulterant and across types of adulterant. In the qualitative analysis of the three types of adulteration, UVE and SPA are adopted to extract feature information, and the corresponding BPNN qualitative analysis models are established. The results show that the prediction-set classification accuracy is 95% for the UVE-BPNN qualitative model and 92.5% for the SPA-BPNN qualitative model. In the quantitative analysis, LS-SVM and PLS models are established. The Rp values of the LS-SVM models are all greater than 0.90: for the three kinds of adulteration (zedoary turmeric powder, wheat flour, and rice flour), the Rp values are 0.9015, 0.9305, and 0.9343, and the RMSEP values are 0.0723, 0.0677, and 0.0619, respectively. For the PLS models, the Rp values are 0.6328, 0.9047, and 0.9424, and the RMSEP values are 0.1341, 0.0771, and 0.0601, respectively. The results show that THz-TDS can be used to detect the three kinds of adulterants in Panax notoginseng powder both qualitatively and quantitatively; the research provides a new approach for detecting the adulteration of Panax notoginseng powder.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.