Journal of Healthcare Engineering

Research Article

An Interpretable Classification Framework for Information Extraction from Online Healthcare Forums

Table 3

Model evaluation. Each model is evaluated using 5-fold cross-validation. The columns report the average accuracy, weighted average precision, weighted average recall, and weighted average F-score for the medication class (M.), the symptom class (S.), and the overall performance (unprefixed columns). Each row reports the performance of a model trained on a different feature combination.


Model	Feature set	M. Acc.	M. Prec.	M. Rec.	M. F1.	S. Acc.	S. Prec.	S. Rec.	S. F1.	Acc.	Prec.	Rec.	F1.

Select + SVM	Word-based	0.843	0.846	0.867	0.856	0.886	0.875	0.804	0.838	0.798	0.808	0.798	0.802
	+ Semantic	0.851	0.854	0.871	0.862	0.884	0.874	0.801	0.836	0.804	0.816	0.804	0.808
	+ Position	0.843	0.846	0.867	0.856	0.886	0.875	0.805	0.838	0.798	0.808	0.798	0.802
	+ Thr. Crt.	0.844	0.846	0.867	0.857	0.896	0.894	0.814	0.852	0.800	0.812	0.800	0.805
	+ Morpho.	0.848	0.855	0.864	0.859	0.891	0.883	0.811	0.846	0.801	0.816	0.801	0.807
	+ Word Cnt.	0.802	0.785	0.871	0.826	0.864	0.888	0.722	0.796	0.761	0.773	0.761	0.763
	LSP	0.799	0.894	0.709	0.790	0.831	0.862	0.644	0.737	0.691	0.821	0.691	0.731
	+ Semantic	0.849	0.865	0.852	0.858	0.891	0.878	0.818	0.846	0.806	0.823	0.806	0.813
	+ Position	0.841	0.851	0.852	0.851	0.893	0.883	0.817	0.848	0.800	0.815	0.800	0.806
	+ Thr. Crt.	0.844	0.852	0.859	0.855	0.897	0.885	0.826	0.855	0.801	0.814	0.801	0.807
	+ Morpho.	0.851	0.860	0.864	0.861	0.896	0.883	0.826	0.854	0.808	0.820	0.808	0.813
	+ Word Cnt.	0.848	0.856	0.862	0.859	0.897	0.884	0.830	0.856	0.807	0.819	0.807	0.812
	+ Word-based	0.810	0.810	0.844	0.826	0.870	0.887	0.739	0.806	0.768	0.792	0.768	0.776

Lasso	Word-based	0.794	0.730	0.979	0.837	0.886	0.969	0.712	0.820	0.791	0.785	0.791	0.756
	+ Semantic	0.793	0.741	0.947	0.831	0.886	0.923	0.752	0.828	0.789	0.754	0.789	0.757
	+ Position	0.795	0.742	0.947	0.832	0.886	0.920	0.754	0.829	0.790	0.757	0.790	0.758
	+ Thr. Crt.	0.796	0.745	0.945	0.833	0.889	0.922	0.762	0.834	0.791	0.756	0.791	0.759
	+ Morpho.	0.797	0.745	0.947	0.834	0.889	0.924	0.759	0.833	0.792	0.757	0.792	0.760
	+ Word Cnt.	0.798	0.746	0.947	0.834	0.891	0.927	0.762	0.836	0.793	0.759	0.793	0.762
	LSP	0.715	0.663	0.955	0.782	0.802	0.875	0.538	0.666	0.711	0.678	0.711	0.665
	+ Semantic	0.769	0.712	0.955	0.816	0.861	0.911	0.689	0.785	0.767	0.727	0.767	0.728
	+ Position	0.767	0.710	0.955	0.814	0.860	0.910	0.686	0.782	0.765	0.716	0.765	0.725
	+ Thr. Crt.	0.771	0.715	0.953	0.817	0.864	0.911	0.700	0.791	0.769	0.728	0.769	0.731
	+ Morpho.	0.771	0.715	0.953	0.817	0.864	0.910	0.698	0.790	0.769	0.728	0.769	0.730
	+ Word Cnt.	0.771	0.715	0.953	0.817	0.864	0.910	0.698	0.790	0.769	0.728	0.769	0.730
	+ Word-based	0.799	0.745	0.950	0.835	0.893	0.930	0.765	0.839	0.795	0.759	0.795	0.763

Forest-based	Word-based	0.848	0.795	0.969	0.873	0.881	0.891	0.773	0.827	0.819	0.808	0.819	0.795
	+ Semantic	0.815	0.761	0.956	0.847	0.878	0.901	0.751	0.819	0.802	0.805	0.802	0.778
	+ Position	0.820	0.767	0.957	0.851	0.887	0.908	0.772	0.833	0.807	0.791	0.807	0.779
	+ Thr. Crt.	0.817	0.765	0.949	0.847	0.872	0.884	0.749	0.811	0.799	0.792	0.799	0.774
	+ Morpho.	0.832	0.776	0.965	0.860	0.890	0.907	0.781	0.838	0.816	0.815	0.816	0.789
	+ Word Cnt.	0.830	0.779	0.954	0.858	0.893	0.893	0.804	0.846	0.814	0.797	0.814	0.783
	LSP	0.786	0.742	0.921	0.822	0.863	0.861	0.748	0.801	0.771	0.725	0.771	0.739
	+ Semantic	0.837	0.824	0.887	0.854	0.879	0.860	0.802	0.829	0.809	0.805	0.809	0.805
	+ Position	0.840	0.836	0.873	0.854	0.882	0.844	0.834	0.839	0.808	0.800	0.808	0.803
	+ Thr. Crt.	0.832	0.825	0.875	0.849	0.879	0.849	0.814	0.831	0.802	0.796	0.802	0.797
	+ Morpho.	0.841	0.829	0.886	0.856	0.881	0.843	0.832	0.837	0.812	0.802	0.812	0.804
	+ Word Cnt.	0.829	0.816	0.881	0.847	0.880	0.856	0.808	0.831	0.800	0.791	0.800	0.793
	+ Word-based	0.848	0.816	0.927	0.868	0.887	0.861	0.827	0.843	0.821	0.803	0.821	0.802
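
The evaluation protocol named in the caption of Table 3 (5-fold cross-validation with support-weighted precision, recall, and F-score) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the contiguous unshuffled fold split, the per-fold averaging, and the `fit`/`predict` callables are all assumptions made for illustration. The weighting shown follows the common definition in which each class's score is weighted by its share of the true labels. Under that definition, weighted recall equals accuracy for single-label predictions, which appears consistent with the identical overall Acc. and Rec. columns in the table.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall, weighted F1).

    Each per-class score is weighted by that class's share of y_true.
    """
    n = len(y_true)
    support = Counter(y_true)      # true instances per class
    predicted = Counter(y_pred)    # predicted instances per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = count / n
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds.

    Assumption: no shuffling or stratification is applied here.
    """
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def cross_validate(fit, predict, X, y, k=5):
    """Average the four scores over k folds.

    `fit(X_train, y_train)` returns a model and `predict(model, x)` returns
    a label; both are hypothetical callables supplied by the caller.
    """
    totals = [0.0] * 4
    for train, test in k_fold_indices(len(y), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        y_pred = [predict(model, X[i]) for i in test]
        scores = weighted_scores([y[i] for i in test], y_pred)
        totals = [t + s for t, s in zip(totals, scores)]
    return tuple(t / k for t in totals)
```

Because the weighted F-score is an average of per-class F-scores rather than the harmonic mean of the two weighted averages, it can fall below both the weighted precision and the weighted recall, as it does in several Lasso rows of the table.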