Research Article
Unsupervised Quality Estimation Model for English to German Translation and Its Application in Extensive Supervised Evaluation
Table 10
Performance on the WMT13 shared tasks, measured by Pearson correlation.
Correlation score with human judgments (Other-to-English: CS-EN to RU-EN; English-to-other: EN-CS to EN-RU).

| System | CS-EN | DE-EN | ES-EN | FR-EN | RU-EN | Mean (Other-to-English) | EN-CS | EN-DE | EN-ES | EN-FR | EN-RU | Mean (English-to-other) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 0.81 | 0.96 | 0.90 | 0.96 | 0.71 | 0.87 | 0.76 | 0.94 | 0.91 | 0.91 | 0.77 | 0.86 |
| | 0.80 | 0.94 | 0.94 | 0.96 | 0.69 | 0.87 | 0.82 | 0.92 | 0.90 | 0.92 | 0.68 | 0.85 |
| METEOR | 0.99 | 0.96 | 0.97 | 0.98 | 0.84 | 0.95 | 0.82 | 0.88 | 0.88 | 0.91 | 0.55 | 0.81 |
| BLEU | 0.89 | 0.91 | 0.94 | 0.94 | 0.60 | 0.86 | 0.80 | 0.82 | 0.88 | 0.90 | 0.62 | 0.80 |
| TER | 0.77 | 0.87 | 0.91 | 0.93 | 0.80 | 0.80 | 0.70 | 0.73 | 0.78 | 0.91 | 0.61 | 0.75 |
Bold font indicates the best performance.
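
For readers unfamiliar with the statistic reported in Table 10, the following is a minimal sketch of how a Pearson correlation between metric scores and human judgments could be computed. It assumes one score per MT system for each language pair; the function name `pearson` and the numbers in the usage lines are hypothetical illustrations, not the authors' code or data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for four MT systems (illustrative only).
metric_scores = [0.21, 0.25, 0.30, 0.34]
human_scores = [0.40, 0.48, 0.55, 0.66]
r = pearson(metric_scores, human_scores)
print(f"Pearson r = {r:.2f}")
```

A value near 1 means the metric ranks and spaces the systems much as the human judges do; each cell of the table is one such coefficient for one language pair.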