Research Article
Tracing Geographical Origins of Teas Based on FT-NIR Spectroscopy: Introduction of Model Updating and Imbalanced Data Handling Approaches
Table 3
Improving the prediction of 2015 tea samples using three outlier-detection-based undersampling approaches.
| Classifier | One-class SVM (OC-SVM) | | | Isolation forest (IF) | | | Elliptic envelope (EE) | | |
|---|---|---|---|---|---|---|---|---|---|
| | Recall 1 | Recall 2 | MAR^a | Recall 1 | Recall 2 | MAR | Recall 1 | Recall 2 | MAR |
| LDA | 0.4993 | 0.3700 | 0.4346 | 0.4643 | 0.4100 | 0.4372 | 0.4498 | 0.3900 | 0.4199 |
| SVM | 0.7409 | 0.7300 | 0.7355 | 0.7191 | 0.7600 | 0.7395 | 0.7118 | 0.7500 | 0.7309 |
| SGD^b | 0.5997 | 0.5100 | 0.5549 | 0.5997 | 0.5100 | 0.5549 | 0.5662 | 0.5100 | 0.5381 |
| Decision tree | 0.6652 | 0.5700 | 0.6176 | 0.6958 | 0.5800 | 0.6379 | 0.6201 | 0.5400 | 0.5800 |
| Random forest | 0.7089 | 0.6300 | 0.6694 | 0.6812 | 0.6200 | 0.6506 | 0.6710 | 0.6100 | 0.6405 |
| AdaBoost | 0.6405 | 0.6800 | 0.6602 | 0.7322 | 0.5400 | 0.6361 | 0.6667 | 0.5600 | 0.6133 |
| MLP^c | 0.6856 | 0.5200 | 0.6028 | 0.8020 | 0.3800 | 0.5910 | 0.7205 | 0.3800 | 0.5503 |
^a MAR: macro average recall; ^b SGD: stochastic gradient descent; ^c MLP: multilayer perceptron.
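
As a brief illustration of the MAR column, the sketch below computes per-class recall and its unweighted mean for a two-class problem. This matches the arithmetic visible in the table (for example, the SVM row under OC-SVM: the mean of 0.7409 and 0.7300 is about 0.7355). The function names and the toy labels are hypothetical and are not taken from the article; the article's own evaluation code is not shown in this section.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals = Counter(y_true)
    # Count correct predictions per true label.
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def macro_average_recall(y_true, y_pred):
    """Unweighted mean of the per-class recalls (MAR)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy, imbalanced labels (made up for illustration, not the tea data):
# eight samples of class 1 and two samples of class 2.
y_true = [1] * 8 + [2] * 2
y_pred = [1] * 6 + [2] * 2 + [2, 1]

recalls = per_class_recall(y_true, y_pred)
mar = macro_average_recall(y_true, y_pred)
print(recalls, mar)
```

Because each class contributes equally to the mean regardless of its size, MAR penalises a classifier that neglects the minority class, which is why it is a natural summary for imbalanced data such as the comparison above.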