Improving Transformer-Based Neural Machine Translation with Prior Alignments
Algorithm 1
Procedure to construct statistical alignments.
(1)
We tokenize both the Vietnamese source sentences and the English target sentences. The Transformer-S1 model uses the same token types as the Transformer-M model: a token in both the source and target sentences is a sequence of characters delimited by spaces. Linguistically, the Vietnamese-English Transformer-M and Transformer-S1 models are therefore syllable-to-word models, since spaces delimit syllables in Vietnamese but words in English.
(2)
We construct many-to-one alignments from Vietnamese to English using the fast_align token aligner.
(3)
We repeat step 2 in the reverse direction, from English to Vietnamese.
(4)
We merge the bidirectional alignments generated in steps 2 and 3 using the grow-diagonal heuristic proposed by Koehn et al. [24].
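The symmetrization in step 4 can be sketched as follows. This is a minimal illustration of the grow-diag idea, not the exact implementation used in the paper: starting from the intersection of the two directional alignments, links from their union are added along diagonal and adjacent neighbours as long as the new link touches a source or target token that is still unaligned. The function name and the set-of-pairs representation are our own choices for illustration.

```python
def grow_diag(src2tgt, tgt2src):
    """Symmetrize two directional word alignments with the grow-diag
    heuristic. Each alignment is a set of (src_index, tgt_index) links."""
    union = src2tgt | tgt2src
    alignment = src2tgt & tgt2src  # start from the high-precision intersection
    # Horizontal, vertical, and diagonal neighbours of an alignment point.
    neighbours = [(-1, 0), (0, -1), (1, 0), (0, 1),
                  (-1, -1), (-1, 1), (1, -1), (1, 1)]
    added = True
    while added:  # keep growing until no new link can be added
        added = False
        for (s, t) in sorted(alignment):
            for ds, dt in neighbours:
                cand = (s + ds, t + dt)
                # Only consider candidate links present in the union.
                if cand not in union or cand in alignment:
                    continue
                src_aligned = any(a[0] == cand[0] for a in alignment)
                tgt_aligned = any(a[1] == cand[1] for a in alignment)
                # Add the link if either end is still unaligned.
                if not (src_aligned and tgt_aligned):
                    alignment.add(cand)
                    added = True
    return alignment
```

For example, merging the forward alignment {(0, 0), (1, 1), (1, 2)} with the reverse alignment {(0, 0), (1, 1), (2, 2)} starts from the intersection {(0, 0), (1, 1)} and then grows diagonally to recover the links (1, 2) and (2, 2) from the union.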