Review Article

Applications of Natural Language Processing in Biodiversity Science

Table 4

Performance metrics for the name recognition and morphological character extraction algorithms reviewed. Recall and precision values may not be directly comparable between algorithms, because they were measured on different test corpora. NA: not available [30].

| Tool | Recall | Precision | Test Corpora | Reference |
|---|---|---|---|---|
| TaxonGrab | >94% | >96% | Vol. 1 Birds of the Belgian Congo by Chapin | [31] |
| FAT | 40.2% | 84.0% | American Seashells by Abbott | [32] |
| Taxon Finder | 54.3% | 97.5% | American Seashells by Abbott | [32] |
| Neti Neti | 70.5% | 98.9% | American Seashells by Abbott | [32] |
| LINNAEUS | 94.3% | 97.1% | LINNAEUS gold standard data set | [33] |
| Organism Tagger | 94.0% | 95.0% | LINNAEUS gold standard data set | [34] |
| X-tract | NA | NA | Flora of North America | [35] |
| Worldwide Botanical Knowledge Base | NA | NA | Flora of China | http://wwbota.free.fr/ |
| Terminator | NA | NA | 16 nematode descriptions | http://www.math.ucdavis.edu/~milton/genisys/terminator.html |
| MultiFlora | mid 60% | mid 70% | Descriptions of Ranunculus spp. from six Floras | http://intranet.cs.man.ac.uk/ai/public/MultiFlora/MF1.html |
| MARTT | 98.0% | 58.0% | Flora of North America and Flora of China | [30] |
| WHISK | 33.33% to 79.65% | 72.52% to 100% | Flora of North America | [13] |
| CharaParser | 90.0% | 91.0% | Flora of North America | [36] |
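
For readers unfamiliar with the two metrics in Table 4, the following is a minimal sketch of how recall and precision are conventionally computed for a name recognition task. It assumes exact string matching between extracted and gold-standard names. The function name and the example names are hypothetical and are not taken from any of the tools or test corpora listed above, whose evaluation procedures may differ.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for a set of extracted items
    compared against a gold-standard set, using exact matching."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: fraction of extracted items that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of gold-standard items that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical data: four names annotated by an expert, three strings
# returned by a tool, one of which is not a taxonomic name.
gold_names = {"Passer domesticus", "Corvus corax",
              "Falco peregrinus", "Bubo bubo"}
found_names = {"Passer domesticus", "Corvus corax", "Belgian Congo"}

precision, recall = precision_recall(found_names, gold_names)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Because both values depend on the gold-standard set, the same tool can score differently on different corpora, which is one reason the figures in Table 4 may not be directly comparable.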