Journal of Healthcare Engineering

Research Article

Distant Supervision with Transductive Learning for Adverse Drug Reaction Identification from Electronic Medical Records

Table 1

A list of previous studies on ADR identification from unstructured text.


Data source	Study	Year	Dataset size	Labels (count)	Labeling method	NER	Method

Supervised learning
EMR	Aramaki et al. [10]	2010	3012 notes	A, O (2)	H	CRF	Pattern-based
	Sohn et al. [11]	2011	237 notes	A, O (2)	H	cTAKES	Pattern-based, DT C4.5
	Henriksson et al. [26]	2015	400 notes	A, I, O (3)	H	CRF	Word embedding, RF
	Casillas et al. [12]	2016	n/a	A, O (2)	H	FreeLing-Med	Pattern-based, SVM, RF
Literature	Peng et al. [16]	2016	18,410 abstracts	A, O (2)	H, DS	Dictionary, tmChem, DNorm	Feature-based, SVM
Social media	Segura-Bedmar et al. [33]	2015	84,000 messages	A, I (2)	DS	GATE	Shallow linguistic kernel, distant supervision
	Nikfarjam et al. [17]	2015	8800 blog sentences, 3200 tweets	A, I, O (3)	H	CRF	Word embedding, CRF
	Jenhani et al. [18]	2016	80,000 tweets	A, O (2)	R, ODIN	Dictionary, Stanford CoreNLP	Rule-based, feature-based, DT, SVM, LR, NB
	Liu et al. [34]	2016	1800 blog sentences, 500 tweets	A, O (2)	H	MetaMap	Feature-based, tree kernel-based, ensemble method

Semisupervised learning
EMR	Taewijit and Theeramunkong [13]	2016	1.5 M sentences	A, I (2)	DS	MetaMap	Distant supervision, OpenIE [35], pattern-based
Literature	Kang et al. [36]	2014	1644 abstracts	A, O (2)	H	Peregrine	Hierarchical graph-based, shortest path
Social media	Liu and Chen [37]	2015	400 sentences	A, I, O (3)	H	MetaMap	Dependency tree, TSVM [38]

Unsupervised learning
EMR	Wang et al. [39]	2009	25,074 notes	None	None	MedLEE	Co-occurrence
Literature	Xu and Wang [14]	2014	119 M sentences	None	None	Parse tree	Pattern-based, ranking
Social media	Feldman et al. [15]	2015	0.1–1 M messages	None	None	Dictionary, pattern	HPSG-based parser, postprocessing of relation merging

Labels: A = ADR; I = IND; O = other (ADR cause, ADR outcome, non-ADR, negated ADR, others). Labeling method: DS = distant supervision; H = human; R = rule-based.