Research Article

An Interpretable Classification Framework for Information Extraction from Online Healthcare Forums

Table 3

Model evaluation. Each model is evaluated using 5-fold cross-validation. The columns report the average accuracy, weighted-average precision, weighted-average recall, and weighted-average F-score for the medication class (M.), the symptom class (S.), and overall. Each row reports the performance of a model trained on a different feature combination.

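| Model | Ft. set | M. Acc. | M. Prec. | M. Rec. | M. F1 | S. Acc. | S. Prec. | S. Rec. | S. F1 | Acc. | Prec. | Rec. | F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Select + SVM | Word-based | 0.843 | 0.846 | 0.867 | 0.856 | 0.886 | 0.875 | 0.804 | 0.838 | 0.798 | 0.808 | 0.798 | 0.802 |
| | + Semantic | 0.851 | 0.854 | 0.871 | 0.862 | 0.884 | 0.874 | 0.801 | 0.836 | 0.804 | 0.816 | 0.804 | 0.808 |
| | + Position | 0.843 | 0.846 | 0.867 | 0.856 | 0.886 | 0.875 | 0.805 | 0.838 | 0.798 | 0.808 | 0.798 | 0.802 |
| | + Thr. Crt. | 0.844 | 0.846 | 0.867 | 0.857 | 0.896 | 0.894 | 0.814 | 0.852 | 0.800 | 0.812 | 0.800 | 0.805 |
| | + Morpho. | 0.848 | 0.855 | 0.864 | 0.859 | 0.891 | 0.883 | 0.811 | 0.846 | 0.801 | 0.816 | 0.801 | 0.807 |
| | + Word Cnt. | 0.802 | 0.785 | 0.871 | 0.826 | 0.864 | 0.888 | 0.722 | 0.796 | 0.761 | 0.773 | 0.761 | 0.763 |
| | LSP | 0.799 | 0.894 | 0.709 | 0.790 | 0.831 | 0.862 | 0.644 | 0.737 | 0.691 | 0.821 | 0.691 | 0.731 |
| | + Semantic | 0.849 | 0.865 | 0.852 | 0.858 | 0.891 | 0.878 | 0.818 | 0.846 | 0.806 | 0.823 | 0.806 | 0.813 |
| | + Position | 0.841 | 0.851 | 0.852 | 0.851 | 0.893 | 0.883 | 0.817 | 0.848 | 0.800 | 0.815 | 0.800 | 0.806 |
| | + Thr. Crt. | 0.844 | 0.852 | 0.859 | 0.855 | 0.897 | 0.885 | 0.826 | 0.855 | 0.801 | 0.814 | 0.801 | 0.807 |
| | + Morpho. | 0.851 | 0.860 | 0.864 | 0.861 | 0.896 | 0.883 | 0.826 | 0.854 | 0.808 | 0.820 | 0.808 | 0.813 |
| | + Word Cnt. | 0.848 | 0.856 | 0.862 | 0.859 | 0.897 | 0.884 | 0.830 | 0.856 | 0.807 | 0.819 | 0.807 | 0.812 |
| | + Word-based | 0.810 | 0.810 | 0.844 | 0.826 | 0.870 | 0.887 | 0.739 | 0.806 | 0.768 | 0.792 | 0.768 | 0.776 |
| Lasso | Word-based | 0.794 | 0.730 | 0.979 | 0.837 | 0.886 | 0.969 | 0.712 | 0.820 | 0.791 | 0.785 | 0.791 | 0.756 |
| | + Semantic | 0.793 | 0.741 | 0.947 | 0.831 | 0.886 | 0.923 | 0.752 | 0.828 | 0.789 | 0.754 | 0.789 | 0.757 |
| | + Position | 0.795 | 0.742 | 0.947 | 0.832 | 0.886 | 0.920 | 0.754 | 0.829 | 0.790 | 0.757 | 0.790 | 0.758 |
| | + Thr. Crt. | 0.796 | 0.745 | 0.945 | 0.833 | 0.889 | 0.922 | 0.762 | 0.834 | 0.791 | 0.756 | 0.791 | 0.759 |
| | + Morpho. | 0.797 | 0.745 | 0.947 | 0.834 | 0.889 | 0.924 | 0.759 | 0.833 | 0.792 | 0.757 | 0.792 | 0.760 |
| | + Word Cnt. | 0.798 | 0.746 | 0.947 | 0.834 | 0.891 | 0.927 | 0.762 | 0.836 | 0.793 | 0.759 | 0.793 | 0.762 |
| | LSP | 0.715 | 0.663 | 0.955 | 0.782 | 0.802 | 0.875 | 0.538 | 0.666 | 0.711 | 0.678 | 0.711 | 0.665 |
| | + Semantic | 0.769 | 0.712 | 0.955 | 0.816 | 0.861 | 0.911 | 0.689 | 0.785 | 0.767 | 0.727 | 0.767 | 0.728 |
| | + Position | 0.767 | 0.710 | 0.955 | 0.814 | 0.860 | 0.910 | 0.686 | 0.782 | 0.765 | 0.716 | 0.765 | 0.725 |
| | + Thr. Crt. | 0.771 | 0.715 | 0.953 | 0.817 | 0.864 | 0.911 | 0.700 | 0.791 | 0.769 | 0.728 | 0.769 | 0.731 |
| | + Morpho. | 0.771 | 0.715 | 0.953 | 0.817 | 0.864 | 0.910 | 0.698 | 0.790 | 0.769 | 0.728 | 0.769 | 0.730 |
| | + Word Cnt. | 0.771 | 0.715 | 0.953 | 0.817 | 0.864 | 0.910 | 0.698 | 0.790 | 0.769 | 0.728 | 0.769 | 0.730 |
| | + Word-based | 0.799 | 0.745 | 0.950 | 0.835 | 0.893 | 0.930 | 0.765 | 0.839 | 0.795 | 0.759 | 0.795 | 0.763 |
| Forest-based | Word-based | 0.848 | 0.795 | 0.969 | 0.873 | 0.881 | 0.891 | 0.773 | 0.827 | 0.819 | 0.808 | 0.819 | 0.795 |
| | + Semantic | 0.815 | 0.761 | 0.956 | 0.847 | 0.878 | 0.901 | 0.751 | 0.819 | 0.802 | 0.805 | 0.802 | 0.778 |
| | + Position | 0.820 | 0.767 | 0.957 | 0.851 | 0.887 | 0.908 | 0.772 | 0.833 | 0.807 | 0.791 | 0.807 | 0.779 |
| | + Thr. Crt. | 0.817 | 0.765 | 0.949 | 0.847 | 0.872 | 0.884 | 0.749 | 0.811 | 0.799 | 0.792 | 0.799 | 0.774 |
| | + Morpho. | 0.832 | 0.776 | 0.965 | 0.860 | 0.890 | 0.907 | 0.781 | 0.838 | 0.816 | 0.815 | 0.816 | 0.789 |
| | + Word Cnt. | 0.830 | 0.779 | 0.954 | 0.858 | 0.893 | 0.893 | 0.804 | 0.846 | 0.814 | 0.797 | 0.814 | 0.783 |
| | LSP | 0.786 | 0.742 | 0.921 | 0.822 | 0.863 | 0.861 | 0.748 | 0.801 | 0.771 | 0.725 | 0.771 | 0.739 |
| | + Semantic | 0.837 | 0.824 | 0.887 | 0.854 | 0.879 | 0.860 | 0.802 | 0.829 | 0.809 | 0.805 | 0.809 | 0.805 |
| | + Position | 0.840 | 0.836 | 0.873 | 0.854 | 0.882 | 0.844 | 0.834 | 0.839 | 0.808 | 0.800 | 0.808 | 0.803 |
| | + Thr. Crt. | 0.832 | 0.825 | 0.875 | 0.849 | 0.879 | 0.849 | 0.814 | 0.831 | 0.802 | 0.796 | 0.802 | 0.797 |
| | + Morpho. | 0.841 | 0.829 | 0.886 | 0.856 | 0.881 | 0.843 | 0.832 | 0.837 | 0.812 | 0.802 | 0.812 | 0.804 |
| | + Word Cnt. | 0.829 | 0.816 | 0.881 | 0.847 | 0.880 | 0.856 | 0.808 | 0.831 | 0.800 | 0.791 | 0.800 | 0.793 |
| | + Word-based | 0.848 | 0.816 | 0.927 | 0.868 | 0.887 | 0.861 | 0.827 | 0.843 | 0.821 | 0.803 | 0.821 | 0.802 |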
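
The evaluation protocol named in the Table 3 caption (5-fold cross-validation with weighted-average precision, recall, and F-score) can be illustrated with a minimal sketch. It rests on assumptions the source does not confirm: per-class accuracy is taken to be one-vs-rest, "weighted average" is taken to mean averaging per-class scores weighted by class support, and folds are contiguous rather than shuffled or stratified. All function names are illustrative, `fit_majority` is a hypothetical stand-in rather than any model in the table, and the labels are toy values, not data from the paper.

```python
from collections import Counter


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def class_metrics(y_true, y_pred, label):
    """One-vs-rest accuracy, precision, recall, and F1 for a single label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    prec = safe_div(tp, tp + fp)
    rec = safe_div(tp, tp + fn)
    return {
        "acc": safe_div(tp + tn, len(pairs)),
        "prec": prec,
        "rec": rec,
        "f1": safe_div(2 * prec * rec, prec + rec),
    }


def overall_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 (assumed weighting)."""
    n = len(y_true)
    out = {"acc": safe_div(sum(t == p for t, p in zip(y_true, y_pred)), n),
           "prec": 0.0, "rec": 0.0, "f1": 0.0}
    for label, support in Counter(y_true).items():
        m = class_metrics(y_true, y_pred, label)
        for key in ("prec", "rec", "f1"):
            out[key] += support / n * m[key]
    return out


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    start = 0
    for fold in range(k):
        size = n // k + (1 if fold < n % k else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n) if not start <= i < start + size]
        yield train, test
        start += size


def cross_validate(fit, X, y, k=5):
    """Average overall_metrics over k folds.

    `fit(X_train, y_train)` must return a function mapping one sample to a label.
    """
    totals = {"acc": 0.0, "prec": 0.0, "rec": 0.0, "f1": 0.0}
    for train, test in k_fold_indices(len(X), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        y_pred = [predict(X[i]) for i in test]
        fold = overall_metrics([y[i] for i in test], y_pred)
        for key in totals:
            totals[key] += fold[key]
    return {key: value / k for key, value in totals.items()}


def fit_majority(X_train, y_train):
    """Hypothetical baseline: always predict the most frequent training label."""
    majority = Counter(y_train).most_common(1)[0][0]
    return lambda sample: majority


if __name__ == "__main__":
    # Toy labels only; these are not data from the paper.
    y_true = ["medication", "medication", "symptom", "symptom", "other", "other"]
    y_pred = ["medication", "symptom", "symptom", "symptom", "other", "medication"]
    print(class_metrics(y_true, y_pred, "medication"))
    print(overall_metrics(y_true, y_pred))
    print(cross_validate(fit_majority, list(range(10)), ["a"] * 8 + ["b"] * 2))
```

Under support weighting in a single-label setting, weighted-average recall reduces to overall accuracy, since the per-class true positives sum to the number of correct predictions. This is consistent with the overall Acc. and Rec. values in Table 3, which are identical in every row.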