This paper aims to provide a stable instrumental method for provenance discrimination of Anji-White tea by its distinctive taste. A total of 180 authentic and 60 counterfeit white tea samples were collected for detection of specific geographical origins; all of them were measured by an electronic tongue equipped with 7 independent sensors. Two chemometrics methods, principal component analysis (PCA) and partial least squares discriminant analysis (PLSDA), were then used for classification. The PCA score plots show that PCA is a simple and reliable tool for provenance analysis of small sample sets, but when all 240 objects were analyzed together the two classes overlapped and PCA could not separate them. Therefore, PLSDA was applied to develop a classification model. The prediction sensitivity and specificity of PLSDA reached 0.917 and 0.950, respectively. This study demonstrates the potential of combining an electronic tongue system and chemometrics as an effective tool for detecting the specific geographical origins of Anji-White tea.

1. Introduction

Green tea, made from the leaves of the Camellia sinensis plant, is one of the most popular beverages in the world. Moreover, it is an excellent source of antioxidants and other beneficial compounds such as polyphenols, polysaccharides, and amino acids, so green tea has been widely consumed as a healthy drink for its preventive effects against obesity, cancer, and liver and cardiovascular diseases [1–3]. Tea plants are widely distributed in over 30 countries and play a significant role in their economies. The properties and chemical components of green tea are influenced by many factors, such as climate, fertilization conditions, geographical origin, and processing procedures. Among these factors, the specific geographical origin is universally accepted as an important aspect [4–6]. Therefore, in China, most famous teas are named after their geographical origins, such as the Anji-White tea, the West Lake-Longjing tea, the Anxi-Tieguanyin tea, and the Wuyi-Rock tea.

Anji-White tea (AWT) is produced in Anji County (Zhejiang province, China) and has been awarded the protected geographical indication (PGI). It is called a “white” tea because its leaves are very light in color due to their low chlorophyll and polyphenol contents [7]. AWT is rich in amino acids and its fragrance lasts for a long time. However, the actual yield of AWT is limited and can hardly meet the increasing market demand, so some merchants fraudulently apply the “Anji-White tea” label to non-Anji-White teas (NAWT) for illegal profit. Although these counterfeit products are inferior to authentic AWT, they look similar and can hardly be distinguished by the naked eye. Until now, the conventional way to verify provenance has been sensory analysis: an unknown tea sample is tasted by professional tea tasters, and its quality is then described by a series of sensory scores. However, the result of sensory analysis depends largely on the subjective judgment of the taster, and it is very expensive and time-consuming to train a professional tea taster. Therefore, there is an urgent demand for a nonhuman technique to discriminate the geographical origins of white teas.

In recent years, instrumental analytical methods have been widely reported in the quality control of foods such as wine, milk, and juice [8, 9]. In these studies, electronic tongue sensors showed good stability and sensitivity. Moreover, a good correlation between human and electronic tongue judgments has been observed, which makes the electronic tongue a promising alternative to human sensory analysis of teas.

This paper focuses on developing an instrumental technique for discriminating the provenance of Anji-White tea by an electronic tongue system and chemometrics. To model the complicated relationship between taste and geographical origin, principal component analysis (PCA) and partial least squares discriminant analysis (PLSDA) models were used for classification [10–13].

2. Materials and Methods

2.1. Tea Samples

In this study, 180 authentic AWT samples were collected from local plantations in 30 white tea production sites of Anji County. In addition, 60 NAWT samples were collected from 10 different production sites such as Miaoxi, Jingan, and Guangde. All of the samples were kept in lightproof packaging in cold storage before analysis. Detailed information about the samples is listed in Table 1.

2.2. Electronic Tongue Analysis

For each sample, 3 g of tea was added to 150 mL of boiling distilled water and infused for 10 minutes. Then, the infusion was filtered into a glass beaker and cooled to 25°C in a water bath for analysis. An ASTREE II electronic tongue system (Alpha M.O.S., Toulouse, France) equipped with a reference electrode and 7 independent liquid sensors (ZZ, BA, BB, CA, GA, HA, and JB) was used for instrumental sensory analysis. To obtain reliable data, the instrument was calibrated with standard NaCl (0.01 mol/L), HCl (0.01 mol/L), and monosodium glutamate (0.01 mol/L) solutions before the test. The signal acquisition interval was 1 s, the acquisition time was 180 s, and the stable signal exported at 180 s was saved for provenance analysis. The cross-sensitivity and selectivity of the sensor array are displayed in Table 2.

2.3. Principal Component Analysis

All data analysis was performed in MATLAB (Mathworks, Sherborn, MA). Because each of the 7 sensors exports its own response, a multivariate statistical model is necessary for classification. Principal component analysis (PCA) is an unsupervised chemometrics tool used for pattern recognition and dimension reduction [14]. The data can be reconstructed from a small number of linearly uncorrelated variables (principal components) without considerable information loss. In this paper, 30 AWT samples from 5 provenances were discriminated using a PCA model; the first three components were retained, so the objects were plotted in a three-dimensional PCA score plot. Similarly, 30 NAWT samples were classified. Then, all of the samples in this study were analyzed by a three-dimensional PCA model in the same way.
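As an illustration of the three-component PCA described above, the following is a minimal sketch in plain Python rather than the MATLAB code used in the study. The function name `pca_scores`, the power-iteration solver, and the synthetic 30 × 7 response matrix are all assumptions made for this example; real sensor data would replace `X`.

```python
import random

def pca_scores(X, n_components=3, iters=500):
    """Return (scores, explained_variance_ratios) for the leading components."""
    n, p = len(X), len(X[0])
    # Mean-center each sensor (column).
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    # Sample covariance matrix (p x p).
    C = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(p)]
         for a in range(p)]
    total_var = sum(C[j][j] for j in range(p))
    loadings, ratios = [], []
    for _component in range(n_components):
        # Power iteration for the dominant eigenvector of the current C.
        v = [1.0 / (j + 1) for j in range(p)]
        for _step in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the matching eigenvalue (variance).
        lam = sum(v[a] * C[a][b] * v[b] for a in range(p) for b in range(p))
        loadings.append(v)
        ratios.append(lam / total_var)
        # Deflate so the next pass finds the next component.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(p)] for a in range(p)]
    # Project the centered data onto the loadings.
    scores = [[sum(r[j] * v[j] for j in range(p)) for v in loadings] for r in Xc]
    return scores, ratios

# Synthetic stand-in for 30 samples x 7 sensors (NOT the study's data):
# two hidden factors plus a little noise.
rng = random.Random(0)
load1 = [1.0] * 7
load2 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
X = []
for _ in range(30):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([10 * a * load1[j] + 3 * b * load2[j] + rng.gauss(0, 0.05)
              for j in range(7)])

scores, ratios = pca_scores(X)
cumulative = sum(ratios)  # cumulative explained variance of the 3 components
```

Each row of `scores` is one point in a three-dimensional score plot, and `cumulative` corresponds to the cumulative variance figure that is typically tabulated for a PCA model.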

2.4. Partial Least Squares Discriminant Analysis

Partial least squares discriminant analysis (PLSDA) has been regarded as one of the cornerstones of chemometrics because it can handle the large $p$ (the number of variables in the data) problem very well [15]. In PLSDA, representative samples are necessary for model training; once the mathematical model has been developed, unknown objects can be classified. In this paper, both AWT and NAWT samples were divided into a training class and a predicting class. To ensure that samples in the predicting class were distributed uniformly around the training class, the Kennard and Stone (K-S) algorithm was used for division [16]. Afterwards, the AWT training class and the NAWT training class were combined into a total training class. Similarly, the AWT predicting class and the NAWT predicting class were combined into a total predicting class. The total training class is therefore arranged in a matrix $\mathbf{X}$ ($n \times p$) containing $p$ variables for $n$ objects. The response vector $\mathbf{y}$ ($n \times 1$) gives the corresponding category of each object in matrix $\mathbf{X}$. The values 1 and −1 are set to denote the AWT and NAWT objects, respectively, and the value 0 is the cut-off value. In prediction, an unknown object is classified into the AWT class if its response value is above 0; otherwise, it is classified into the NAWT class.
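The K-S division mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `kennard_stone`, the use of squared Euclidean distance, and the start from the two most distant objects are assumptions (the usual textbook form of the algorithm), and the one-dimensional toy data are invented.

```python
def kennard_stone(X, n_select):
    """Pick n_select row indices of X that cover the data space evenly.

    Returns (selected, remaining); 'selected' would form the training
    class and 'remaining' the predicting class.
    """
    def d2(a, b):
        # Squared Euclidean distance between two objects.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n = len(X)
    # Start with the two objects that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: d2(X[ij[0]], X[ij[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    # Distance from each remaining object to its nearest selected object.
    nearest = {i: min(d2(X[i], X[s]) for s in selected) for i in remaining}
    while len(selected) < n_select:
        # Add the object that is farthest from everything chosen so far.
        nxt = max(remaining, key=lambda i: nearest[i])
        selected.append(nxt)
        remaining.remove(nxt)
        for i in remaining:
            nearest[i] = min(nearest[i], d2(X[i], X[nxt]))
    return selected, remaining

# Toy one-dimensional data (hypothetical), choosing 3 of 5 objects.
toy = [[0.0], [1.0], [2.0], [3.0], [10.0]]
selected, remaining = kennard_stone(toy, 3)
print(selected, remaining)  # prints [0, 4, 3] [1, 2]
```

Running the selection separately on the AWT and NAWT samples, as the text describes, keeps the class proportions the same in the training and predicting classes.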

For PLSDA, the number of latent variables controls the complexity of the model. Too few latent variables lead to an underfitted model, whereas too many raise the risk of overfitting, so the number of latent variables is an important parameter in classification. Thus, Monte Carlo cross-validation (MCCV) was adopted to optimize the model [17]; the number of latent variables was chosen as the one giving the minimal misclassification rate of MCCV (MRMCCV), calculated by the following formula:

$$\mathrm{MRMCCV} = \frac{1}{N}\sum_{i=1}^{N}\frac{m_i}{M},$$

where $N$ is the number of splits, $m_i$ is the number of misclassified objects in the $i$th split, and $M$ is the total number of prediction objects in each split.
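To make the latent-variable selection concrete, the sketch below fits a PLS1 model by the NIPALS algorithm, classifies with the cut-off of 0 on a +1/−1 response, and averages the misclassification rate over random 50/50 splits. It is an assumption-laden illustration, not the study's MATLAB code: the function names (`pls1_fit`, `pls1_predict`, `mrmccv`), the choice of NIPALS, and the synthetic 7-variable data are all hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def pls1_fit(X, y, n_lv):
    """Fit PLS1 by NIPALS; y holds +1 (AWT) or -1 (NAWT)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(r[j] for r in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[r[j] - x_mean[j] for j in range(p)] for r in X]  # centered X
    f = [v - y_mean for v in y]                            # centered y
    W, P, Q = [], [], []
    for _ in range(n_lv):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:  # nothing left to model
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # scores of this latent variable
        tt = dot(t, t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = dot(f, t) / tt
        # Deflate X and y before extracting the next latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w)
        P.append(load)
        Q.append(q)
    return x_mean, y_mean, W, P, Q

def pls1_predict(model, x):
    """Predicted response for one object; above 0 means AWT."""
    x_mean, y_mean, W, P, Q = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in zip(W, P, Q):
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat

def mrmccv(X, y, n_lv, n_splits=100, train_frac=0.5, seed=0):
    """Mean misclassification rate over random train/test splits."""
    rng = random.Random(seed)
    n = len(X)
    n_train = int(n * train_frac)
    rate_sum = 0.0
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        model = pls1_fit([X[i] for i in train], [y[i] for i in train], n_lv)
        wrong = sum(1 for i in test
                    if (1 if pls1_predict(model, X[i]) > 0 else -1) != y[i])
        rate_sum += wrong / len(test)
    return rate_sum / n_splits

# Synthetic 7-variable data (NOT the study's data): 45 "AWT", 15 "NAWT".
rng = random.Random(1)
shift = [1.0, 0.0, 1.0, -1.0, 0.0, 0.0, 0.0]
X, y = [], []
for label, count in ((1, 45), (-1, 15)):
    for _ in range(count):
        base = shift if label == 1 else [0.0] * 7
        X.append([b + rng.gauss(0, 0.3) for b in base])
        y.append(label)

rates = {k: mrmccv(X, y, k, n_splits=50) for k in range(1, 8)}
best_lv = min(rates, key=rates.get)  # latent variables with minimal MRMCCV
```

Here `rates` maps each candidate number of latent variables (1 to 7) to its averaged misclassification rate, and `best_lv` is the value that would be carried forward to the final model.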

To evaluate the performance of the PLSDA model, sensitivity and specificity were computed [18]. The AWT was denoted as “positive” and the NAWT as “negative,” and sensitivity and specificity were then calculated as follows:

$$\mathrm{Sensitivity} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad \mathrm{Specificity} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}},$$

where TP, FN, TN, and FP, respectively, represent the number of true positives, false negatives, true negatives, and false positives.
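The two measures can be computed directly from true and predicted labels, as in the short sketch below. The function name `sensitivity_specificity` and the label lists are illustrative; the lists are built only to reproduce the prediction counts reported in Section 3 (55 of 60 AWT and 19 of 20 NAWT classified correctly), not taken from the raw data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) with `positive` as the AWT label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Labels arranged to match the reported prediction counts:
# 60 AWT (+1) with 5 misclassified, 20 NAWT (-1) with 1 misclassified.
y_true = [1] * 60 + [-1] * 20
y_pred = [1] * 55 + [-1] * 5 + [-1] * 19 + [1] * 1

sens, spec = sensitivity_specificity(y_true, y_pred)
print(round(sens, 3), round(spec, 3))  # prints 0.917 0.95
```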

3. Results and Discussions

Average responses of the sensors to AWT and NAWT are shown in Figure 1. As seen from Figure 1, the responses of BA, GA, HA, and JB to AWT and NAWT are very similar, the AWT objects give stronger signals on the ZZ and BB sensors, and the NAWT objects give a higher CA signal than the AWT objects.

The PCA score results are shown in Figure 2. Three principal components of the data were extracted in each analysis. The variance information of the three PCA models is listed in Table 3. As seen from Table 3, the three PCA models (models A, B, and C) perform well statistically because all of their cumulative variances exceed 98%. As plotted in Figure 2, the AWT and NAWT samples from 5 different provenances are clearly discriminated in models A and B, respectively. In model C, however, the 180 AWT and 60 NAWT objects overlap with each other; in other words, PCA can hardly satisfy the demand of provenance discrimination for the complete sample set.

PLSDA is an effective technique for classification. In this paper, the K-S algorithm was used to obtain the training class and the predicting class; 120 AWT and 40 NAWT samples were picked for training, and the other 60 AWT and 20 NAWT samples were used for prediction. MCCV was then used to estimate the number of latent variables (from 1 to 7): the training class was randomly split into a secondary training class (50%) and a secondary predicting class (50%) 100 times, and the minimal MRMCCV value was obtained with a four-component model. Therefore, a classification model with 4 latent variables was developed and the predicting class was used to evaluate the performance of this model; Figure 3 shows the training and predicting results of the PLSDA model. For training, the sensitivity reached 0.933 (112/120) and the specificity reached 0.950 (38/40). In prediction, the sensitivity of the PLSDA model is 0.917 (55/60) and the specificity is 0.950 (19/20); only 5 AWT objects and 1 NAWT object are misclassified.

4. Conclusion

The feasibility of combining an electronic tongue and chemometrics for provenance detection of Anji-White tea was investigated. The electronic tongue is a rapid and stable instrument that consumes little sample in analysis, so it is suitable for food quality control, especially the taste analysis of tea products. PCA and PLSDA models were used for classification. The PCA results demonstrate that objects from a small sample set can be clearly discriminated by their geographical origins in a three-dimensional PCA score plot. However, PCA is an unsupervised tool: the provenance and other sample features are not used in the separation; consequently, the AWT and NAWT objects overlap seriously when all of the objects are analyzed in one PCA model. PLSDA performs better than PCA in provenance discrimination; both the sensitivity and the specificity are satisfactory in prediction. But PLSDA is more complicated to model: it requires plentiful and representative samples for training, and model optimization takes a long time. In our future work, more classification models such as artificial neural networks and support vector machines will be investigated for provenance discrimination.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.