Research Article
Unsupervised Quality Estimation Model for English to German Translation and Its Application in Extensive Supervised Evaluation
Table 9
Performance on the WMT13 shared tasks, measured by Spearman rank correlation.
Correlation score with human judgments.

| System | CS-EN | DE-EN | ES-EN | FR-EN | RU-EN | Mean (other-to-English) | EN-CS | EN-DE | EN-ES | EN-FR | EN-RU | Mean (English-to-other) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 0.80 | 0.93 | 0.75 | 0.95 | 0.79 | 0.84 | 0.75 | 0.90 | 0.84 | 0.90 | 0.85 | 0.85 |
| | 0.85 | 0.95 | 0.83 | 0.95 | 0.72 | 0.86 | 0.82 | 0.90 | 0.85 | 0.92 | 0.73 | 0.84 |
| METEOR | 0.96 | 0.96 | 0.98 | 0.98 | 0.81 | 0.94 | 0.94 | 0.88 | 0.78 | 0.92 | 0.57 | 0.82 |
| BLEU | 0.94 | 0.90 | 0.88 | 0.99 | 0.67 | 0.88 | 0.90 | 0.79 | 0.76 | 0.90 | 0.57 | 0.78 |
| TER | 0.80 | 0.83 | 0.83 | 0.95 | 0.60 | 0.80 | 0.86 | 0.85 | 0.75 | 0.91 | 0.54 | 0.78 |
Bold font indicates the best performance.
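
As an illustration of how a system-level Spearman rank correlation of the kind reported in Table 9 can be computed, the following is a minimal sketch. It assumes there are no tied scores and uses the closed form rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between the two ranks of a system and n is the number of systems. The function names `rank` and `spearman_rho` and the score lists are hypothetical and are not taken from the article or from the WMT13 data.

```python
def rank(values):
    """Rank values from 1 (highest score) to n. Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho(metric_scores, human_scores):
    """Spearman rank correlation: rho = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    if len(metric_scores) != len(human_scores):
        raise ValueError("both score lists must cover the same systems")
    n = len(metric_scores)
    if n < 2:
        raise ValueError("at least two systems are required")
    metric_ranks = rank(metric_scores)
    human_ranks = rank(human_scores)
    # Sum of squared rank differences across systems.
    d_squared = sum((m - h) ** 2 for m, h in zip(metric_ranks, human_ranks))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical scores for five MT systems on one language pair
# (illustrative values only, not WMT13 results).
metric = [0.31, 0.27, 0.35, 0.22, 0.29]  # automatic metric scores
human = [0.58, 0.49, 0.61, 0.47, 0.45]   # human judgment scores

rho = spearman_rho(metric, human)
print(round(rho, 2))
```

Under these assumptions, each cell in the table would correspond to one such value for a given metric and language pair, and each Mean column would be the average of the five per-pair values in its direction.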