Research Article
Chinese Personal Name Disambiguation Based on Clustering
Table 2
Data statistics for each personal name in the corpus.
| Personal name | Number of documents | Discarded (gold standard) | Discarded (ICTCLAS) | Discarded (LTP) |
| --- | --- | --- | --- | --- |
| Li Jun (李军) | 234 | 1 | 0 | 4 |
| Roger (罗杰) | 357 | 24 | 3 | 2 |
| Gao Jun (高军) | 300 | 82 | 84 | 107 |
| Sun Ming (孙明) | 207 | 2 | 2 | 2 |
| Zhang Jianjun (张建军) | 247 | 0 | 0 | 1 |