Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine
Table 5
Model performance on TDS test data sets.
| Model | Accuracy | Precision | Recall | F1-score |
| --- | --- | --- | --- | --- |
| Encoder (Char)-Decoder (Char) | 0.7980 ± 0.0078^a,b | 0.7876 ± 0.0147^a,b | 0.7678 ± 0.0088^a,b | 0.7775 ± 0.0106^a,b |
| Encoder (Char)-Decoder (Label) | 0.7974 ± 0.0050^a,b | 0.8060 ± 0.0081^a,b | 0.7690 ± 0.0062^a,b | 0.7870 ± 0.0056^a,b |
| Encoder (Word)-Decoder (Label) | 0.7892 ± 0.0069^a,b | 0.7979 ± 0.0100^a,b | 0.7595 ± 0.0066^a,b | 0.7782 ± 0.0074^a,b |
| Encoder (Word)-Decoder (Word) | 0.7904 ± 0.0079^a,b | 0.7805 ± 0.0122^a,b | 0.7594 ± 0.0078^a,b | 0.7698 ± 0.0092^a,b |
| Encoder (Char)-Classification | 0.7559 ± 0.0056^c | 0.8560 ± 0.0125^c | 0.7278 ± 0.0057^c | 0.7866 ± 0.0058^c |
| Encoder (Word)-Classification | 0.7652 ± 0.0042^c | 0.8557 ± 0.0065^c | 0.7364 ± 0.0038^c | 0.7915 ± 0.0028^c |
| BERT-UniLM (Char) | 0.8274 ± 0.0087^c | 0.8152 ± 0.0115^c | 0.8043 ± 0.0082^c | 0.8097 ± 0.0094^c |
| BERT-UniLM (Label) | 0.8248 ± 0.0045^c | 0.8230 ± 0.0066^c | 0.7970 ± 0.0056^c | 0.8098 ± 0.0037^c |
| BERT-Classification | 0.8568 ± 0.0029 | 0.8870 ± 0.0039 | 0.8298 ± 0.0037 | 0.8574 ± 0.0026 |
Note. Results are expressed as mean ± SD; the threshold value of the sigmoid function was 0.1. ^a, compared with BERT-UniLM (Char); ^b, compared with BERT-UniLM (Label); ^c, compared with BERT-Classification.
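As context for the metrics above, the following is a minimal sketch (not the paper's code) of how such figures could be obtained in a multi-label setting: sigmoid outputs are binarized at the stated threshold of 0.1, and micro-averaged precision, recall, and F1-score are computed over all labels. The probabilities and gold labels below are purely illustrative.

```python
def threshold(probs, t=0.1):
    """Binarize sigmoid probabilities: predict a label when its score >= t."""
    return [[1 if p >= t else 0 for p in row] for row in probs]

def micro_prf1(y_true, y_pred):
    """Micro-averaged precision, recall, and F1 over all (sample, label) pairs."""
    tp = sum(t and p for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp))
    pred_pos = sum(p for row in y_pred for p in row)   # all predicted positives
    true_pos = sum(t for row in y_true for t in row)   # all gold positives
    precision = tp / pred_pos if pred_pos else 0.0
    recall = tp / true_pos if true_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical sigmoid outputs for 3 samples over 4 normalized-symptom labels
probs = [[0.92, 0.05, 0.30, 0.02],
         [0.08, 0.88, 0.15, 0.01],
         [0.40, 0.03, 0.95, 0.12]]
gold = [[1, 0, 1, 0],
        [0, 1, 0, 0],
        [1, 0, 1, 0]]

pred = threshold(probs, t=0.1)
p, r, f1 = micro_prf1(gold, pred)
```

Note how the low threshold of 0.1 favors recall over precision: in this toy example several borderline scores (0.15, 0.12) become false positives, which matches the general precision/recall trade-off such a threshold implies.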