Table 4: Overall results on three datasets.

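| Dataset | κ | γ | θ | Recall | Precision | F1 score |
|---|---|---|---|---|---|---|
| Obesity challenge | False | 0.1 | 0.2 | 0.805 | 0.925 | 0.861 |
| Baseline | | | | 0.771 | 0.815 | 0.787 |
| Medication challenge | False | 0.1 | 0.35 | 0.636 | 0.838 | 0.724 |
| Baseline | | | | 0.794 | 0.845 | 0.818 |
| OPERAM medical conditions | True | 0.1 | 0.5 | 0.594 | 0.271 | 0.373 |
| OPERAM medications | False | 0 | 0.35 | 0.795 | 0.816 | 0.805 |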

Baseline: average of the top five systems from the challenge.
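
As a sanity check on the table, the sketch below recomputes F1 from the reported precision and recall, assuming the F1 score column is the usual harmonic mean of the two (the table does not say whether scores are micro- or macro-averaged, so this is an assumption). The function name `f1_score` and the `ROWS` dictionary are illustrative and not from the source.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported F1) copied from the non-baseline rows of Table 4.
ROWS = {
    "Obesity challenge": (0.805, 0.925, 0.861),
    "Medication challenge": (0.636, 0.838, 0.724),
    "OPERAM medical conditions": (0.594, 0.271, 0.373),
    "OPERAM medications": (0.795, 0.816, 0.805),
}

for name, (recall, precision, reported) in ROWS.items():
    computed = f1_score(precision, recall)
    # Print both values so any gap beyond rounding is easy to spot.
    print(f"{name}: computed {computed:.3f}, reported {reported:.3f}")
```

The baseline rows are averages over several systems, so their averaged F1 need not equal the harmonic mean of the averaged precision and recall; they are left out of the check for that reason.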