Research Article

ABioNER: A BERT-Based Model for Arabic Biomedical Named-Entity Recognition

Table 1

An overview of methods used for biomedical named-entity recognition.

| Paper | Language | Named entities (NE) | Method | Results |
|---|---|---|---|---|
| [8] | English | Gene, protein | SVM | Best balanced F1-score = 0.79 |
| [9] | English | Chemical mentions | Hybrid (CRF + dictionary) | F1-score = 68.1 |
| [10] | English | Problem, treatment, test, protein, DNA, RNA, cell type, cell line | Unsupervised learning | Overall performance (exact micro-F): Pittsburgh dataset: 53.1; GENIA dataset: 39.5 |
| [11] | English | Disease | Hybrid (stacked ensemble + fuzzy matching) | F1-score = 89.12% |
| [12] | English | Disease | Multiple label convolutional neural networks | F1-score: NCBI corpus: 85.17%; CDR corpus: 87.83% |
| [13] | English | Document-level chemical NER | Hybrid (attention-based BiLSTM-CRF) | F1-score: CHEMDNER corpus: 91.14%; CDR corpus: 92.57% |
| [14] | English | Genes, diseases, protein, DNA, RNA, cell type, cell line | n-Gram character and word embeddings via convolutional neural network | F1-score: NCBI dataset: 87.26%; BioCreative II dataset: 87.26%; JNLPBA dataset: 72.57% |
| [15] | English | Chemicals, diseases, species, gene, protein | Transfer learning | F1-score: 88.21, 82.09, 87.01, 83.09 |
| [16] | English | Diseases, species | Bidirectional encoder representations from transformers | Best F1-score: 89.71, 75.31 |
| [17] | Spanish/Swedish | Spanish: disease/drug; Swedish: body part/disorder/finding | Bidirectional long short-term memory network | Avg. F1-score: Spanish: 75.25; Swedish: 76.04 |
| [2] | Arabic | Disease, diagnosis, symptoms, treatment methods | Bayesian belief network (BBN) | Avg. F1-score: 71.05% |