Journal of Spectroscopy

Review Article

The Quality Control of Tea by Near-Infrared Reflectance (NIR) Spectroscopy and Chemometrics

Table 1

Overview of NIR spectroscopy for the quality control of tea.


Commodity	Attributes	Methods	Wavelength scanned	Spectral pretreatment	Calibration models	Results	No. of samples	References

Tea leaves	Caffeine, catechin (gallic acid, GC, EGC, C, EGCG, EC, GCG, ECG)	NIR	400∼2500 nm	Win ISI Score	MPLS	r² for caffeine: 0.97; GA: 0.85; GC: 0.78; EGC: 0.95; C: 0.91; EGCG: 0.97; EC: 0.95; GCG: 0.85; ECG: 0.94	665	[38]
Black tea	Amino acids, caffeine, theaflavins, water extract	FT-NIR	4000∼10,000 cm⁻¹	SNV, MSC	PLS, Si-PLS, GA-PLS, Bi-PLS	Using GA-PLS, R_p for amino acids: 0.9498; water extract: 0.8785; using Bi-PLS, R_p for caffeine: 0.9232; theaflavins: 0.924	95	[39]
Black, dark, oolong, and green tea	Total polyphenols, caffeine, free amino acids	FT-NIR	4000∼10,000 cm⁻¹	MSC combined with first-order derivative and SG smoothing	PLS1, PLS2, RF-PLS, CARS-PLS	CARS-PLS achieved the best predictive performance (determination coefficient) for total polyphenols: 0.994; caffeine: 0.986; free amino acids: 0.993	145	[35]
Green tea	Total polyphenols	VIS-NIR	300∼1000 nm	SNV	PLS, Si-PLS, CARS-Si-PLS, GA-Si-PLS	Prediction set (R_p) for PLS: 0.8043; Si-PLS: 0.8804; GA-Si-PLS: 0.8859; CARS-Si-PLS: 0.8753	50	[40]
Tea extract	Total polyphenols	VIS-NIR	300∼1000 nm	SNV	PLS, Si-PLS, GA-PLS, CARS-PLS, ACO-PLS	Prediction set (RMSEP) for PLS: 0.7659; Si-PLS: 0.8766; GA-PLS: 0.8993; CARS-PLS: 0.8897; ACO-PLS: 0.8853	85	[41]
Longjing tea leaves	Moisture content	NIR hyperspectral imaging	874.41∼1733.91 nm	Smoothing filter (3 × 3 window), MNF rotation, 2D LoG (Laplacian of Gaussian) filter	PCPLS1-9, SPA-PLS	r² for PCPLS1-9: 0.9491, 0.8826, 0.9531, 0.8905, 0.9548, 0.9105, 0.9713, 0.9071, and 0.9610; SPA-PLS: 0.9216	30	[42]
“Biluochun” green tea	Sensory attributes	FT-NIR	4000∼10,000 cm⁻¹	SNV	Si-PLS, PCA, BPNN, BP-AdaBoost	BP-AdaBoost model revealed its superior performance, R_p = 0.7717	70	[37]
Black tea	Color sensory quality	VIS-NIR	200∼1100 nm	SNV	GA-BPANN	R_p = 0.8935	127	[18]
Black tea	Theaflavin, thearubigin	NIR	1000∼1799 nm	MSC, SG 1st derivative, Min/Max, SNVT	PLS, SI-PLS, SI-CARS-PLS, SI-CARS-ELM, SI-CARS-SVM, SI-CARS-ELM-AdaBoost	ELM-AdaBoost was used for the validation, determination coefficient = 0.893	78	[43]
Green tea	Lutein, Chl-b, Chl-a, Phe-b, Phe-a, β-carotene	VIS-NIR	400∼2498 nm	ANOVA	PLS, SPA, MLR	MLR gave superior prediction (determination coefficient) for lutein: 0.975; Chl-b: 0.973; Chl-a: 0.993; Phe-b: 0.919; Phe-a: 0.962; β-carotene: 0.965	135	[44]
White tea and albino tea	Tea polyphenols, free amino acids, moisture, ash contents	FT-NIR	4000∼12,400 cm⁻¹	MSC, SNV, SG smoothing, KND, 1st and 2nd derivatives	DPLS, DA	DPLS: 98.48; DA: 100	70	[45]
Green, yellow, white, black, and oolong tea	Region of interest	VIS-NIR	589, 635, 670, 783 nm	SNV	LDA, Lib-SVM, ELM	Lib-SVM was the best model, r² = 98.39%	206	[46]
Green, yellow, white, black, and pu-erh tea	—	NIR	950∼1760 nm	SG smoothing, standard deviation, SNV	PCA, MDS, t-SNE, ISOMAP, SVM-ECOC	SVM-ECOC model provided a classification accuracy of 97.41 ± 0.16%	6	[19]
Iron Buddha tea	Total polyphenols	VIS-NIR	800∼2500 nm	SNV	PLS (LS-SVM and BPNN)	Classification accuracies: LS-SVM: 95.0%; BPNN: 97.5%	180	[47]
Pu-erh tea	Metabolomics analysis	NIR	3600∼12,500 cm⁻¹	OPUS 7.2 software from Bruker Optics	PCA, PLS, HCA, PLS-DA	PLS model showed nearly complete fit and excellent predictive capability (r² = 0.967; Q² = 0.93)	17	[48]
Green tea (Anji-white)	—	NIR	4000∼12,000 cm⁻¹	Smoothing, 2nd derivative, SNV	OCPLS, SIMCA	With SNV preprocessing, OCPLS provided sensitivity of 0.886 and specificity of 0.951; SIMCA provided sensitivity of 0.886 and specificity of 0.938 and achieved best classification performance	248	[36]
Green tea	Catechin, EC, EGC, ECG, EGCG, GCG	NIR	1050∼2500 nm	1st derivative	PLS, BP-ANN, SVM	Accuracy (%): PLS: 100.000; BP-ANN: 95.455; SVM: 98.485	220	[49]
Green tea	—	NIR	4000∼9000 cm⁻¹	2nd derivative, SNV	OVR-PLSDA, OVO-PLSDA, PLSDA-softmax, ES-PLSDA	Total accuracy (%): OVR-PLSDA: 64.68; OVO-PLSDA: 84.94; PLSDA-softmax: 92.99; ES-PLSDA: 93.77	1540	[50]
Black tea	Caffeine, water extract, total polyphenols, free amino acids	NIR	4000∼12,500 cm⁻¹	SNV, MSC, Min/Max	PLS	(1) R in the prediction set for caffeine: 0.955; water extracts: 0.962; total polyphenols: 0.954; free amino acids: 0.927 (2) Identification accuracy (%): 94.30	140	[51]
Green and black tea	—	NIR	3800∼14,000 cm⁻¹	1st derivative, SG smoothing	SIMCA, PLSDA, SPA-LDA	Classification accuracy (%): SIMCA: 88.00; PLSDA: 92.00; SPA-LDA: 100	82	[52]
Oolong tea	—	NIR	4000∼12,000 cm⁻¹	SNV, 2nd derivative, smoothing	PLSDA	The sensitivity of PLSDA model for raw data: 0.971; SNV: 1.000; 2nd derivative: 0.886; smoothing: 0.971	570	[53]
Oolong tea	Polyphenols, alkaloids, protein, volatile and nonvolatile acids, aroma compounds	NIR and NMR	3300∼12,500 cm⁻¹	SNV, 2nd derivative, SG smoothing	PCA, PLSDA	Discrimination accuracy (%) for NMR + NIR data: 86.20∼95.80; NMR data: 68.20∼78.70; NIR data: 80.00∼89.30	90	[17]

Abbreviations: ACO, ant colony optimization; ANOVA, one-way analysis of variance; Bi-PLS, backward interval PLS; BP-ANN, backpropagation artificial neural network; BPNN, backpropagation neural network; C, (+)-catechin; CARS-PLS, competitive adaptive reweighted sampling-partial least squares; Chl-a, chlorophyll a; Chl-b, chlorophyll b; EC, (−)-epicatechin; ECG, (−)-epicatechin gallate; EGC, (−)-epigallocatechin; EGCG, (−)-epigallocatechin gallate; ELM, extreme learning machine; ES, ensemble strategy; NIR, near-infrared reflectance; FT-NIR, Fourier transform near-infrared reflectance; GA, genetic algorithm; GC, (−)-gallocatechin; GCG, (−)-gallocatechin gallate; ISOMAP, isometric mapping; KND, Karl Norris derivative filter; LDA, linear discriminant analysis; PLS, partial least squares; PLSDA, partial least squares discriminant analysis; Lib-SVM, library support vector machine; MDS, multidimensional scaling; Min/Max, min/max normalization; MLR, multiple linear regression; MNF, minimal noise fraction; MPLS, modified partial least squares; MSC, multiplicative scattering correction; NMR, nuclear magnetic resonance; OCPLS, one-class partial least squares; OVO-PLSDA, one-versus-one-partial least squares discriminant analysis; OVR-PLSDA, one-versus-rest-partial least squares discriminant analysis; PCA, principal component analysis; Phe-a, pheophytin a; Phe-b, pheophytin b; Q², cross-validated correlation coefficient; r², coefficient of determination in the prediction set; R_p, correlation coefficient in the prediction set; , determination coefficient; RF-PLS, random frog-partial least squares; SG smoothing, Savitzky–Golay smoothing; SIMCA, soft independent modeling of class analogy; Si-PLS, synergy interval partial least squares; SNV, standard normal variate; SNVT, standard normal variate transformation; SPA-LDA, successive projections algorithm associated with linear discriminant analysis; SVM, support vector machine; SVM-ECOC, error-correcting output code (ECOC) model containing support vector machine (SVM); t-SNE, t-distributed stochastic neighbor embedding; VIS-NIR, visible and near-infrared reflectance; —, not mentioned.
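
Table 1 reports spectral ranges in both nanometres and wavenumbers, lists SNV as the most frequent pretreatment, and summarizes results with both correlation coefficients (R_p) and coefficients of determination (r²). The following is a minimal Python sketch, not taken from any of the cited studies, of how these quantities can be computed for comparison. The function names are hypothetical, and the r² formula (1 − SS_res/SS_tot) is one common definition assumed here; individual papers may instead report the squared correlation coefficient.

```python
import numpy as np


def wavenumber_to_wavelength_nm(wavenumber_cm1):
    """Convert wavenumber (cm^-1) to wavelength (nm) via lambda = 1e7 / nu."""
    return 1e7 / np.asarray(wavenumber_cm1, dtype=float)


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row)
    by its own mean and sample standard deviation."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def prediction_metrics(y_true, y_pred):
    """Return (R_p, r2, RMSEP) for a prediction set.

    R_p   : Pearson correlation between reference and predicted values
    r2    : 1 - SS_res / SS_tot (assumed definition, see lead-in)
    RMSEP : root mean square error of prediction
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    r_p = np.corrcoef(y_true, y_pred)[0, 1]
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmsep = np.sqrt(np.mean(residuals ** 2))
    return r_p, r2, rmsep
```

For instance, `wavenumber_to_wavelength_nm([4000, 10000])` maps the FT-NIR range 4000∼10,000 cm⁻¹ used in several rows to 2500∼1000 nm, which places it on the same scale as the rows given in nanometres. Because R_p and r² measure different things, values in the Results column are only directly comparable between rows that report the same metric.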