Research Article

Tracing Geographical Origins of Teas Based on FT-NIR Spectroscopy: Introduction of Model Updating and Imbalanced Data Handling Approaches

Table 3

Improving the prediction of 2015 tea samples using three outlier-detection-based undersampling approaches.

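| Classifier | One Class SVM (OC-SVM) | | | Isolation forest (IF) | | | Elliptic envelope (EE) | | |
|---|---|---|---|---|---|---|---|---|---|
| | Recall 1 | Recall 2 | MAR^a | Recall 1 | Recall 2 | MAR | Recall 1 | Recall 2 | MAR |
| LDA | 0.4993 | 0.3700 | 0.4346 | 0.4643 | 0.4100 | 0.4372 | 0.4498 | 0.3900 | 0.4199 |
| SVM | 0.7409 | 0.7300 | 0.7355 | 0.7191 | 0.7600 | 0.7395 | 0.7118 | 0.7500 | 0.7309 |
| SGD^b | 0.5997 | 0.5100 | 0.5549 | 0.5997 | 0.5100 | 0.5549 | 0.5662 | 0.5100 | 0.5381 |
| Decision tree | 0.6652 | 0.5700 | 0.6176 | 0.6958 | 0.5800 | 0.6379 | 0.6201 | 0.5400 | 0.5800 |
| Random forest | 0.7089 | 0.6300 | 0.6694 | 0.6812 | 0.6200 | 0.6506 | 0.6710 | 0.6100 | 0.6405 |
| AdaBoost | 0.6405 | 0.6800 | 0.6602 | 0.7322 | 0.5400 | 0.6361 | 0.6667 | 0.5600 | 0.6133 |
| MLP^c | 0.6856 | 0.5200 | 0.6028 | 0.8020 | 0.3800 | 0.5910 | 0.7205 | 0.3800 | 0.5503 |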

^a MAR: macro average recall; ^b SGD: stochastic gradient descent; ^c MLP: multilayer perceptron.
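
As a reading aid, the MAR columns can be reproduced from the two recall columns. The sketch below is illustrative only: it assumes that "Recall 1" and "Recall 2" are the per-class recalls of a two-class problem and that MAR is their unweighted mean, which agrees with the tabulated values to four decimals. The function names and the toy labels are hypothetical and do not come from the article.

```python
from typing import Hashable, Sequence


def per_class_recall(y_true: Sequence[Hashable],
                     y_pred: Sequence[Hashable],
                     label: Hashable) -> float:
    """Recall for one class: correctly predicted members / all true members."""
    member_idx = [i for i, y in enumerate(y_true) if y == label]
    if not member_idx:
        raise ValueError(f"class {label!r} has no true samples")
    hits = sum(1 for i in member_idx if y_pred[i] == label)
    return hits / len(member_idx)


def macro_average_recall(y_true: Sequence[Hashable],
                         y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class recalls (each class counts equally)."""
    labels = sorted(set(y_true))
    recalls = [per_class_recall(y_true, y_pred, lab) for lab in labels]
    return sum(recalls) / len(recalls)


# Toy, made-up labels (not data from the article): class 1 is the majority.
toy_true = [1, 1, 1, 1, 2, 2]
toy_pred = [1, 1, 1, 2, 2, 1]
toy_mar = macro_average_recall(toy_true, toy_pred)

# Consistency check against one table row (SVM with OC-SVM undersampling):
# the mean of Recall 1 and Recall 2 should match the reported MAR of 0.7355.
svm_ocsvm_mar = (0.7409 + 0.7300) / 2
```

Because each class contributes equally regardless of its size, MAR penalizes a classifier that favors the majority class, as in the MLP row under isolation forest, where a high Recall 1 (0.8020) is offset by a low Recall 2 (0.3800).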