Evidence-Based Complementary and Alternative Medicine

Research Article

Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine

Table 4

Model performance on HFDS test data sets.


Model	Accuracy	Precision	Recall	F1-score

Encoder (Char)-Decoder (Char)	0.8641 ± 0.0065^a,b	0.8656 ± 0.0084^a,b	0.8555 ± 0.0062^a,c	0.8605 ± 0.0056^a,b
Encoder (Char)-Decoder (Label)	0.8558 ± 0.0070^a,b	0.8727 ± 0.0038^a,b	0.8463 ± 0.0062^a,b	0.8593 ± 0.0043^a,b
Encoder (Word)-Decoder (Label)	0.8487 ± 0.0046^a,b	0.8678 ± 0.0076^a,b	0.8377 ± 0.0054^a,b	0.8525 ± 0.0059^a,b
Encoder (Word)-Decoder (Word)	0.8451 ± 0.0035^a,b	0.8472 ± 0.0056^a,b	0.8345 ± 0.0036^a,b	0.8408 ± 0.0023^a,b
Encoder (Char)-Classification	0.8311 ± 0.0078^c	0.8937 ± 0.0072^c	0.8342 ± 0.0077^c	0.8629 ± 0.0045^c
Encoder (Word)-Classification	0.8294 ± 0.0079^c	0.8983 ± 0.0055^c	0.8302 ± 0.0070^c	0.8629 ± 0.0038^c
BERT-UniLM (Char)	0.8914 ± 0.0059^c	0.8983 ± 0.0042^c	0.8855 ± 0.0077^c	0.8918 ± 0.0043^c
BERT-UniLM (Label)	0.8829 ± 0.0046^c	0.8909 ± 0.0069^c	0.8773 ± 0.0044^c	0.8840 ± 0.0036^c
BERT-Classification	0.9051 ± 0.0039	0.9118 ± 0.0033	0.9028 ± 0.0046	0.9073 ± 0.0033

Note. Results are expressed as mean ± SD; the threshold value of the sigmoid function was 0.2. ^a: compared with BERT-UniLM (Char); ^b: compared with BERT-UniLM (Label); ^c: compared with BERT-Classification.
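
As an illustration of how the quantities in the note relate to the table entries, the following is a minimal sketch in Python and not the authors' implementation. It assumes a multi-label setting in which each symptom label receives a sigmoid score and is predicted when that score reaches the stated threshold of 0.2. The micro-averaged definitions of precision, recall, and F1-score, the use of the sample standard deviation, and the helper names (predict_labels, micro_prf, mean_sd) are all assumptions made for the example; the source does not state the averaging scheme or how accuracy was computed, so accuracy is omitted here.

```python
import math
from statistics import mean, stdev

THRESHOLD = 0.2  # sigmoid threshold stated in the table note


def sigmoid(x):
    """Logistic function mapping a raw score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=THRESHOLD):
    """Hypothetical helper: mark a label as predicted when its
    sigmoid score is at least the threshold."""
    return [[1 if sigmoid(z) >= threshold else 0 for z in row] for row in logits]


def micro_prf(y_true, y_pred):
    """Micro-averaged precision, recall, and F1 over all (sample, label)
    pairs. Micro-averaging is an assumption, not confirmed by the source."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def mean_sd(values):
    """Mean and sample standard deviation across repeated runs
    (sample rather than population SD is an assumption)."""
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Toy data: two samples, three candidate labels each (made up for illustration).
    logits = [[2.0, -3.0, -1.0], [-2.0, 0.5, -5.0]]
    y_true = [[1, 0, 0], [0, 1, 1]]
    y_pred = predict_labels(logits)
    print(y_pred)
    print(micro_prf(y_true, y_pred))
    # Hypothetical F1 values from three repeated runs.
    print(mean_sd([0.90, 0.91, 0.92]))
```

With a threshold as low as 0.2, a label is predicted even when its sigmoid score is well below 0.5, which generally favors recall over precision relative to the conventional 0.5 cutoff.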