Proceedings of the Sixth Conference on Machine Translation

Loic Barrault, Ondrej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussa, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, Andre Martins, Makoto Morishita, Christof Monz (Editors)


Anthology ID:
2021.wmt-1
Month:
November
Year:
2021
Address:
Online
Venues:
EMNLP | WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2021.wmt-1
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
https://aclanthology.org/2021.wmt-1.pdf

pdf bib
Proceedings of the Sixth Conference on Machine Translation
Loic Barrault | Ondrej Bojar | Fethi Bougares | Rajen Chatterjee | Marta R. Costa-jussa | Christian Federmann | Mark Fishel | Alexander Fraser | Markus Freitag | Yvette Graham | Roman Grundkiewicz | Paco Guzman | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | Tom Kocmi | Andre Martins | Makoto Morishita | Christof Monz

pdf bib
GTCOM Neural Machine Translation Systems for WMT21GTCOM Neural Machine Translation Systems for WMT21
Chao Bei | Hao Zong

This paper describes the Global Tone Communication Co., Ltd.’s submission of the WMT21 shared news translation task. We participate in six directions : English to / from Hausa, Hindi to / from Bengali and Zulu to / from Xhosa. Our submitted systems are unconstrained and focus on multilingual translation odel, backtranslation and forward-translation. We also apply rules and language model to filter monolingual, parallel sentences and synthetic sentences.

pdf bib
The TALP-UPC Participation in WMT21 News Translation Task : an mBART-based NMT ApproachTALP-UPC Participation in WMT21 News Translation Task: an mBART-based NMT Approach
Carlos Escolano | Ioannis Tsiamas | Christine Basta | Javier Ferrando | Marta R. Costa-jussa | José A. R. Fonollosa

This paper describes the submission to the WMT 2021 news translation shared task by the UPC Machine Translation group. The goal of the task is to translate German to French (De-Fr) and French to German (Fr-De). Our submission focuses on fine-tuning a pre-trained model to take advantage of monolingual data. We fine-tune mBART50 using the filtered data, and additionally, we train a Transformer model on the same data from scratch. In the experiments, we show that fine-tuning mBART50 results in 31.69 BLEU for De-Fr and 23.63 BLEU for Fr-De, which increases 2.71 and 1.90 BLEU accordingly, as compared to the model we train from scratch. Our final submission is an ensemble of these two models, further increasing 0.3 BLEU for Fr-De.

pdf bib
Mieind’s WMT 2021 SubmissionWMT 2021 Submission
Haukur Barri Símonarson | Vésteinn Snæbjarnarson | Pétur Orri Ragnarson | Haukur Jónsson | Vilhjalmur Thorsteinsson

We present Mieind’s submission for the EnglishIcelandic and IcelandicEnglish subsets of the 2021 WMT news translation task. Transformer-base models are trained for translation on parallel data to generate backtranslations teratively. A pretrained mBART-25 model is then adapted for translation using parallel data as well as the last backtranslation iteration. This adapted pretrained model is then used to re-generate backtranslations, and the training of the adapted model is continued.

pdf bib
Allegro.eu Submission to WMT21 News Translation TaskWMT21 News Translation Task
Mikołaj Koszowski | Karol Grzegorczyk | Tsimur Hadeliya

We submitted two uni-directional models, one for EnglishIcelandic direction and other for IcelandicEnglish direction. Our news translation system is based on the transformer-big architecture, it makes use of corpora filtering, back-translation and forward translation applied to parallel and monolingual data alike

pdf bib
Illinois Japanese English News Translation for WMT 2021Illinois Japanese English News Translation for WMT 2021
Giang Le | Shinka Mori | Lane Schwartz

This system paper describes an end-to-end NMT pipeline for the Japanese English news translation task as submitted to WMT 2021, where we explore the efficacy of techniques such as tokenizing with language-independent and language-dependent tokenizers, normalizing by orthographic conversion, creating a politeness-and-formality-aware model by implementing a tagger, back-translation, model ensembling, and n-best reranking. We use parallel corpora provided by WMT 2021 organizers for training, and development and test data from WMT 2020 for evaluation of different experiment models. The preprocessed corpora are trained with a Transformer neural network model. We found that combining various techniques described herein, such as language-independent BPE tokenization, incorporating politeness and formality tags, model ensembling, n-best reranking, and back-translation produced the best translation models relative to other experiment systems.\\leftrightarrow English news translation task as submitted to WMT 2021, where we explore the efficacy of techniques such as tokenizing with language-independent and language-dependent tokenizers, normalizing by orthographic conversion, creating a politeness-and-formality-aware model by implementing a tagger, back-translation, model ensembling, and n-best reranking. We use parallel corpora provided by WMT 2021 organizers for training, and development and test data from WMT 2020 for evaluation of different experiment models. The preprocessed corpora are trained with a Transformer neural network model. We found that combining various techniques described herein, such as language-independent BPE tokenization, incorporating politeness and formality tags, model ensembling, n-best reranking, and back-translation produced the best translation models relative to other experiment systems.

pdf bib
The Fujitsu DMATH Submissions for WMT21 News Translation and Biomedical Translation TasksDMATH Submissions for WMT21 News Translation and Biomedical Translation Tasks
Ander Martinez

This paper describes the Fujitsu DMATH systems used for WMT 2021 News Translation and Biomedical Translation tasks. We focused on low-resource pairs, using a simple system. We conducted experiments on English-Hausa, Xhosa-Zulu and English-Basque, and submitted the results for XhosaZulu in the News Translation Task, and EnglishBasque in the Biomedical Translation Task, abstract and terminology translation subtasks. Our system combines BPE dropout, sub-subword features and back-translation with a Transformer (base) model, achieving good results on the evaluation sets.

pdf bib
The University of Edinburgh’s Bengali-Hindi Submissions to the WMT21 News Translation TaskUniversity of Edinburgh’s Bengali-Hindi Submissions to the WMT21 News Translation Task
Proyag Pal | Alham Fikri Aji | Pinzhen Chen | Sukanta Sen

We describe the University of Edinburgh’s BengaliHindi constrained systems submitted to the WMT21 News Translation task. We submitted ensembles of Transformer models built with large-scale back-translation and fine-tuned on subsets of training data retrieved based on similarity to the target domain.\\leftrightarrowHindi constrained systems submitted to the WMT21 News Translation task. We submitted ensembles of Transformer models built with large-scale back-translation and fine-tuned on subsets of training data retrieved based on similarity to the target domain.

pdf bib
The Volctrans GLAT System : Non-autoregressive Translation Meets WMT21GLAT System: Non-autoregressive Translation Meets WMT21
Lihua Qian | Yi Zhou | Zaixiang Zheng | Yaoming Zhu | Zehui Lin | Jiangtao Feng | Shanbo Cheng | Lei Li | Mingxuan Wang | Hao Zhou

This paper describes the Volctrans’ submission to the WMT21 news translation shared task for German-English translation. We build a parallel (i.e., non-autoregressive) translation system using the Glancing Transformer, which enables fast and accurate parallel decoding in contrast to the currently prevailing autoregressive models. To the best of our knowledge, this is the first parallel translation system that can be scaled to such a practical scenario like WMT competition. More importantly, our parallel translation system achieves the best BLEU score (35.0) on German-English translation task, outperforming all strong autoregressive counterparts.

pdf bib
Tencent Translation System for the WMT21 News Translation TaskWMT21 News Translation Task
Longyue Wang | Mu Li | Fangxu Liu | Shuming Shi | Zhaopeng Tu | Xing Wang | Shuangzhi Wu | Jiali Zeng | Wen Zhang

This paper describes Tencent Translation systems for the WMT21 shared task. We participate in the news translation task on three language pairs : Chinese-English, English-Chinese and German-English. Our systems are built on various Transformer models with novel techniques adapted from our recent research work. First, we combine different data augmentation methods including back-translation, forward-translation and right-to-left training to enlarge the training data. We also apply language coverage bias, data rejuvenation and uncertainty-based sampling approaches to select content-relevant and high-quality data from large parallel and monolingual corpora. Expect for in-domain fine-tuning, we also propose a fine-grained one model one domain approach to model characteristics of different news genres at fine-tuning and decoding stages. Besides, we use greed-based ensemble algorithm and transductive ensemble method to further boost our systems. Based on our success in the last WMT, we continuously employed advanced techniques such as large batch training, data selection and data filtering. Finally, our constrained Chinese-English system achieves 33.4 case-sensitive BLEU score, which is the highest among all submissions. The German-English system is ranked at second place accordingly.

pdf bib
HW-TSC’s Participation in the WMT 2021 News Translation Shared TaskHW-TSC’s Participation in the WMT 2021 News Translation Shared Task
Daimeng Wei | Zongyao Li | Zhanglin Wu | Zhengzhe Yu | Xiaoyu Chen | Hengchao Shang | Jiaxin Guo | Minghan Wang | Lizhi Lei | Min Zhang | Hao Yang | Ying Qin

This paper presents the submission of Huawei Translate Services Center (HW-TSC) to the WMT 2021 News Translation Shared Task. We participate in 7 language pairs, including Zh / En, De / En, Ja / En, Ha / En, Is / En, Hi / Bn, and Xh / Zu in both directions under the constrained condition. We use Transformer architecture and obtain the best performance via multiple variants with larger parameter sizes. We perform detailed pre-processing and filtering on the provided large-scale bilingual and monolingual datasets. Several commonly used strategies are used to train our models, such as Back Translation, Forward Translation, Multilingual Translation, Ensemble Knowledge Distillation, etc. Our submission obtains competitive results in the final evaluation.

pdf bib
Small Model and In-Domain Data Are All You Need
Hui Zeng

I participated in the WMT shared news translation task and focus on one high resource language pair : English and Chinese (two directions, Chinese to English and English to Chinese). The submitted systems (ZengHuiMT) focus on data cleaning, data selection, back translation and model ensemble. The techniques I used for data filtering and selection include filtering by rules, language model and word alignment. I used a base translation model trained on initial corpus to obtain the target versions of the WMT21 test sets, then I used language models to find out the monolingual data that is most similar to the target version of test set, such monolingual data was then used to do back translation. On the test set, my best submitted systems achieve 35.9 and 32.2 BLEU for English to Chinese and Chinese to English directions respectively, which are quite high for a small model.

pdf bib
The Mininglamp Machine Translation System for WMT21WMT21
Shiyu Zhao | Xiaopu Li | Minghui Wu | Jie Hao

This paper describes Mininglamp neural machine translation systems of the WMT2021 news translation tasks. We have participated in eight directions translation tasks for news text including Chinese to / from English, Hausa to / from English, German to / from English and French to / from German. Our fundamental system was based on Transformer architecture, with wider or smaller construction for different news translation tasks. We mainly utilized the method of back-translation, knowledge distillation and fine-tuning to boost single model, while the ensemble was used to combine single models. Our final submission has ranked first for the English to / from Hausa task.

pdf bib
Improving Similar Language Translation With Transfer Learning
Ife Adebara | Muhammad Abdul-Mageed

We investigate transfer learning based on pre-trained neural machine translation models to translate between (low-resource) similar languages. This work is part of our contribution to the WMT 2021 Similar Languages Translation Shared Task where we submitted models for different language pairs, including French-Bambara, Spanish-Catalan, and Spanish-Portuguese in both directions. Our models for Catalan-Spanish (82.79 BLEU)and Portuguese-Spanish (87.11 BLEU) rank top 1 in the official shared task evaluation, and we are the only team to submit models for the French-Bambara pairs.

pdf bib
T4 T Solution : WMT21 Similar Language Task for the Spanish-Catalan and Spanish-Portuguese Language PairT4T Solution: WMT21 Similar Language Task for the Spanish-Catalan and Spanish-Portuguese Language Pair
Miguel Canals | Marc Raventós Tato

The main idea of this solution has been to focus on corpus cleaning and preparation and after that, use an out of box solution (OpenNMT) with its default published transformer model. To prepare the corpus, we have used set of standard tools (as Moses scripts or python packages), but also, among other python scripts, a python custom tokenizer with the ability to replace numbers for variables, solve the upper / lower case issue of the vocabulary and provide good segmentation for most of the punctuation. We also have started a line to clean corpus based on statistical probability estimation of source-target corpus, with unclear results. Also, we have run some tests with syllabical word segmentation, again with unclear results, so at the end, after word sentence tokenization we have used BPE SentencePiece for subword units to feed OpenNMT.

pdf bib
Similar Language Translation for Catalan, Portuguese and Spanish Using Marian NMTCatalan, Portuguese and Spanish Using Marian NMT
Reinhard Rapp

This paper describes the SEBAMAT contribution to the 2021 WMT Similar Language Translation shared task. Using the Marian neural machine translation toolkit, translation systems based on Google’s transformer architecture were built in both directions of CatalanSpanish and PortugueseSpanish. The systems were trained in two contrastive parameter settings (different vocabulary sizes for byte pair encoding) using only the parallel but not the comparable corpora provided by the shared task organizers. According to their official evaluation results, the SEBAMAT system turned out to be competitive with rankings among the top teams and BLEU scores between 38 and 47 for the language pairs involving Portuguese and between 76 and 80 for the language pairs involving Catalan.

pdf bib
Adapting Neural Machine Translation for Automatic Post-Editing
Abhishek Sharma | Prabhakar Gupta | Anil Nelakanti

Automatic post-editing (APE) models are usedto correct machine translation (MT) system outputs by learning from human post-editing patterns. We present the system used in our submission to the WMT’21 Automatic Post-Editing (APE) English-German (En-De) shared task. We leverage the state-of-the-art MT system (Ng et al., 2019) for this task. For further improvements, we adapt the MT model to the task domain by using WikiMatrix (Schwenket al., 2021) followed by fine-tuning with additional APE samples from previous editions of the shared task (WMT-16,17,18) and ensembling the models. Our systems beat the baseline on TER scores on the WMT’21 test set.

pdf bib
HW-TSC’s Participation in the WMT 2021 Triangular MT Shared TaskHW-TSC’s Participation in the WMT 2021 Triangular MT Shared Task
Zongyao Li | Daimeng Wei | Hengchao Shang | Xiaoyu Chen | Zhanglin Wu | Zhengzhe Yu | Jiaxin Guo | Minghan Wang | Lizhi Lei | Min Zhang | Hao Yang | Ying Qin

This paper presents the submission of Huawei Translation Service Center (HW-TSC) to WMT 2021 Triangular MT Shared Task. We participate in the Russian-to-Chinese task under the constrained condition. We use Transformer architecture and obtain the best performance via a variant with larger parameter sizes. We perform detailed data pre-processing and filtering on the provided large-scale bilingual data. Several strategies are used to train our models, such as Multilingual Translation, Back Translation, Forward Translation, Data Denoising, Average Checkpoint, Ensemble, Fine-tuning, etc. Our system obtains 32.5 BLEU on the dev set and 27.7 BLEU on the test set, the highest score among all submissions.

pdf bib
Transfer Learning with Shallow Decoders : BSC at WMT2021’s Multilingual Low-Resource Translation for Indo-European Languages Shared TaskBSC at WMT2021’s Multilingual Low-Resource Translation for Indo-European Languages Shared Task
Ksenia Kharitonova | Ona de Gibert Bonet | Jordi Armengol-Estapé | Mar Rodriguez i Alvarez | Maite Melero

This paper describes the participation of the BSC team in the WMT2021’s Multilingual Low-Resource Translation for Indo-European Languages Shared Task. The system aims to solve the Subtask 2 : Wikipedia cultural heritage articles, which involves translation in four Romance languages : Catalan, Italian, Occitan and Romanian. The submitted system is a multilingual semi-supervised machine translation model. It is based on a pre-trained language model, namely XLM-RoBERTa, that is later fine-tuned with parallel data obtained mostly from OPUS. Unlike other works, we only use XLM to initialize the encoder and randomly initialize a shallow decoder. The reported results are robust and perform well for all tested languages.

pdf bib
Back-translation for Large-Scale Multilingual Machine Translation
Baohao Liao | Shahram Khadivi | Sanjika Hewavitharana

This paper illustrates our approach to the shared task on large-scale multilingual machine translation in the sixth conference on machine translation (WMT-21). In this work, we aim to build a single multilingual translation system with a hypothesis that a universal cross-language representation leads to better multilingual translation performance. We extend the exploration of different back-translation methods from bilingual translation to multilingual translation. Better performance is obtained by the constrained sampling method, which is different from the finding of the bilingual translation. Besides, we also explore the effect of vocabularies and the amount of synthetic data. Surprisingly, the smaller size of vocabularies perform better, and the extensive monolingual English data offers a modest improvement. We submitted to both the small tasks and achieve the second place.

pdf bib
Maastricht University’s Large-Scale Multilingual Machine Translation System for WMT 2021WMT 2021
Danni Liu | Jan Niehues

We present our development of the multilingual machine translation system for the large-scale multilingual machine translation task at WMT 2021. Starting form the provided baseline system, we investigated several techniques to improve the translation quality on the target subset of languages. We were able to significantly improve the translation quality by adapting the system towards the target subset of languages and by generating synthetic data using the initial model. Techniques successfully applied in zero-shot multilingual machine translation (e.g. similarity regularizer) only had a minor effect on the final translation performance.

pdf bib
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared TaskMicrosoft for WMT21 Shared Task
Jian Yang | Shuming Ma | Haoyang Huang | Dongdong Zhang | Li Dong | Shaohan Huang | Alexandre Muzio | Saksham Singhal | Hany Hassan | Xia Song | Furu Wei

This report describes Microsoft’s machine translation systems for the WMT21 shared task on large-scale multilingual machine translation. We participated in all three evaluation tracks including Large Track and two Small Tracks where the former one is unconstrained and the latter two are fully constrained. Our model submissions to the shared task were initialized with DeltaLM, a generic pre-trained multilingual encoder-decoder model, and fine-tuned correspondingly with the vast collected parallel data and allowed data sources according to track settings, together with applying progressive learning and iterative back-translation approaches to further improve the performance. Our final submissions ranked first on three tracks in terms of the automatic evaluation metric.

pdf bib
HW-TSC’s Participation in the WMT 2021 Large-Scale Multilingual Translation TaskHW-TSC’s Participation in the WMT 2021 Large-Scale Multilingual Translation Task
Zhengzhe Yu | Daimeng Wei | Zongyao Li | Hengchao Shang | Xiaoyu Chen | Zhanglin Wu | Jiaxin Guo | Minghan Wang | Lizhi Lei | Min Zhang | Hao Yang | Ying Qin

This paper presents the submission of Huawei Translation Services Center (HW-TSC) to the WMT 2021 Large-Scale Multilingual Translation Task. We participate in Samll Track # 2, including 6 languages : Javanese (Jv), Indonesian (I d), Malay (Ms), Tagalog (Tl), Tamil (Ta) and English (En) with 30 directions under the constrained condition. We use Transformer architecture and obtain the best performance via multiple variants with larger parameter sizes. We train a single multilingual model to translate all the 30 directions. We perform detailed pre-processing and filtering on the provided large-scale bilingual and monolingual datasets. Several commonly used strategies are used to train our models, such as Back Translation, Forward Translation, Ensemble Knowledge Distillation, Adapter Fine-tuning. Our model obtains competitive results in the end.

pdf bib
Just Ask ! Evaluating Machine Translation by Asking and Answering Questions
Mateusz Krubiński | Erfan Ghadery | Marie-Francine Moens | Pavel Pecina

In this paper, we show that automatically-generated questions and answers can be used to evaluate the quality of Machine Translation (MT) systems. Building on recent work on the evaluation of abstractive text summarization, we propose a new metric for system-level MT evaluation, compare it with other state-of-the-art solutions, and show its robustness by conducting experiments for various MT directions.

pdf bib
Evaluating Multiway Multilingual NMT in the Turkic LanguagesNMT in the Turkic Languages
Jamshidbek Mirzakhalov | Anoop Babu | Aigiz Kunafin | Ahsan Wahab | Bekhzodbek Moydinboyev | Sardana Ivanova | Mokhiyakhon Uzokova | Shaxnoza Pulatova | Duygu Ataman | Julia Kreutzer | Francis Tyers | Orhan Firat | John Licato | Sriram Chellappan

Despite the increasing number of large and comprehensive machine translation (MT) systems, evaluation of these methods in various languages has been restrained by the lack of high-quality parallel corpora as well as engagement with the people that speak these languages. In this study, we present an evaluation of state-of-the-art approaches to training and evaluating MT systems in 22 languages from the Turkic language family, most of which being extremely under-explored. First, we adopt the TIL Corpus with a few key improvements to the training and the evaluation sets. Then, we train 26 bilingual baselines as well as a multi-way neural MT (MNMT) model using the corpus and perform an extensive analysis using automatic metrics as well as human evaluations. We find that the MNMT model outperforms almost all bilingual baselines in the out-of-domain test sets and finetuning the model on a downstream task of a single pair also results in a huge performance boost in both low- and high-resource scenarios. Our attentive analysis of evaluation criteria for MT models in Turkic languages also points to the necessity for further research in this direction. We release the corpus splits, test sets as well as models to the public.

pdf bib
DELA Corpus-A Document-Level Corpus Annotated with Context-Related IssuesDELA Corpus - A Document-Level Corpus Annotated with Context-Related Issues
Sheila Castilho | João Lucas Cavalheiro Camargo | Miguel Menezes | Andy Way

Recently, the Machine Translation (MT) community has become more interested in document-level evaluation especially in light of reactions to claims of human parity, since examining the quality at the level of the document rather than at the sentence level allows for the assessment of suprasentential context, providing a more reliable evaluation. This paper presents a document-level corpus annotated in English with context-aware issues that arise when translating from English into Brazilian Portuguese, namely ellipsis, gender, lexical ambiguity, number, reference, and terminology, with six different domains. The corpus can be used as a challenge test set for evaluation and as a training / testing corpus for MT as well as for deep linguistic analysis of context issues. To the best of our knowledge, this is the first corpus of its kind.

pdf bib
Improving Machine Translation of Rare and Unseen Word Senses
Viktor Hangya | Qianchu Liu | Dario Stojanovski | Alexander Fraser | Anna Korhonen

The performance of NMT systems has improved drastically in the past few years but the translation of multi-sense words still poses a challenge. Since word senses are not represented uniformly in the parallel corpora used for training, there is an excessive use of the most frequent sense in MT output. In this work, we propose CmBT (Contextually-mined Back-Translation), an approach for improving multi-sense word translation leveraging pre-trained cross-lingual contextual word representations (CCWRs). Because of their contextual sensitivity and their large pre-training data, CCWRs can easily capture word senses that are missing or very rare in parallel corpora used to train MT. Specifically, CmBT applies bilingual lexicon induction on CCWRs to mine sense-specific target sentences from a monolingual dataset, and then back-translates these sentences to generate a pseudo parallel corpus as additional training data for an MT system. We test the translation quality of ambiguous words on the MuCoW test suite, which was built to test the word sense disambiguation effectiveness of MT systems. We show that our system improves on the translation of difficult unseen and low frequency word senses.

pdf bib
Findings of the WMT 2021 Shared Task on Quality EstimationWMT 2021 Shared Task on Quality Estimation
Lucia Specia | Frédéric Blain | Marina Fomicheva | Chrysoula Zerva | Zhenhao Li | Vishrav Chaudhary | André F. T. Martins

We report the results of the WMT 2021 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word and sentence levels. This edition focused on two main novel additions : (i) prediction for unseen languages, i.e. zero-shot settings, and (ii) prediction of sentences with catastrophic errors. In addition, new data was released for a number of languages, especially post-edited data. Participating teams from 19 institutions submitted altogether 1263 systems to different task variants and language pairs.

pdf bib
Efficient Machine Translation with Model Pruning and Quantization
Maximiliana Behnke | Nikolay Bogoychev | Alham Fikri Aji | Kenneth Heafield | Graeme Nail | Qianqian Zhu | Svetlana Tchistiakova | Jelmer van der Linde | Pinzhen Chen | Sidharth Kashyap | Roman Grundkiewicz

We participated in all tracks of the WMT 2021 efficient machine translation task : single-core CPU, multi-core CPU, and GPU hardware with throughput and latency conditions. Our submissions combine several efficiency strategies : knowledge distillation, a simpler simple recurrent unit (SSRU) decoder with one or two layers, lexical shortlists, smaller numerical formats, and pruning. For the CPU track, we used quantized 8-bit models. For the GPU track, we experimented with FP16 and 8-bit integers in tensorcores. Some of our submissions optimize for size via 4-bit log quantization and omitting a lexical shortlist. We have extended pruning to more parts of the network, emphasizing component- and block-level pruning that actually improves speed unlike coefficient-wise pruning.

pdf bib
Lingua Custodia’s Participation at the WMT 2021 Machine Translation Using Terminologies Shared TaskWMT 2021 Machine Translation Using Terminologies Shared Task
Melissa Ailem | Jingshu Liu | Raheel Qader

This paper describes Lingua Custodia’s submission to the WMT21 shared task on machine translation using terminologies. We consider three directions, namely English to French, Russian, and Chinese. We rely on a Transformer-based architecture as a building block, and we explore a method which introduces two main changes to the standard procedure to handle terminologies. The first one consists in augmenting the training data in such a way as to encourage the model to learn a copy behavior when it encounters terminology constraint terms. The second change is constraint token masking, whose purpose is to ease copy behavior learning and to improve model generalization. Empirical results show that our method satisfies most terminology constraints while maintaining high translation quality.

pdf bib
Kakao Enterprise’s WMT21 Machine Translation Using Terminologies Task SubmissionWMT21 Machine Translation Using Terminologies Task Submission
Yunju Bak | Jimin Sun | Jay Kim | Sungwon Lyu | Changmin Lee

This paper describes Kakao Enterprise’s submission to the WMT21 shared Machine Translation using Terminologies task. We integrate terminology constraints by pre-training with target lemma annotations and fine-tuning with exact target annotations utilizing the given terminology dataset. This approach yields a model that achieves outstanding results in terms of both translation quality and term consistency, ranking first based on COMET in the EnFr language direction. Furthermore, we explore various methods such as back-translation, explicitly training terminologies as additional parallel data, and in-domain data selection.

pdf bib
The SPECTRANS System Description for the WMT21 Terminology TaskSPECTRANS System Description for the WMT21 Terminology Task
Nicolas Ballier | Dahn Cho | Bilal Faye | Zong-You Ke | Hanna Martikainen | Mojca Pecman | Guillaume Wisniewski | Jean-Baptiste Yunès | Lichao Zhu | Maria Zimina-Poirot

This paper discusses the WMT 2021 terminology shared task from a meta perspective. We present the results of our experiments using the terminology dataset and the OpenNMT (Klein et al., 2017) and JoeyNMT (Kreutzer et al., 2019) toolkits for the language direction English to French. Our experiment 1 compares the predictions of the two toolkits. Experiment 2 uses OpenNMT to fine-tune the model. We report our results for the task with the evaluation script but mostly discuss the linguistic properties of the terminology dataset provided for the task. We provide evidence of the importance of text genres across scores, having replicated the evaluation scripts.

pdf bib
Dynamic Terminology Integration for COVID-19 and Other Emerging DomainsCOVID-19 and Other Emerging Domains
Toms Bergmanis | Mārcis Pinnis

The majority of language domains require prudent use of terminology to ensure clarity and adequacy of information conveyed. While the correct use of terminology for some languages and domains can be achieved by adapting general-purpose MT systems on large volumes of in-domain parallel data, such quantities of domain-specific data are seldom available for less-resourced languages and niche domains. Furthermore, as exemplified by COVID-19 recently, no domain-specific parallel data is readily available for emerging domains. However, the gravity of this recent calamity created a high demand for reliable translation of critical information regarding pandemic and infection prevention. This work is part of WMT2021 Shared Task : Machine Translation using Terminologies, where we describe Tilde MT systems that are capable of dynamic terminology integration at the time of translation. Our systems achieve up to 94 % COVID-19 term use accuracy on the test set of the EN-FR language pair without having access to any form of in-domain information during system training.

pdf bib
CUNI Systems for WMT21 : Terminology Translation Shared TaskCUNI Systems for WMT21: Terminology Translation Shared Task
Josef Jon | Michal Novák | João Paulo Aires | Dusan Varis | Ondřej Bojar

This paper describes Charles University sub-mission for Terminology translation Shared Task at WMT21. The objective of this task is to design a system which translates certain terms based on a provided terminology database, while preserving high overall translation quality. We competed in English-French language pair. Our approach is based on providing the desired translations alongside the input sentence and training the model to use these provided terms. We lemmatize the terms both during the training and inference, to allow the model to learn how to produce correct surface forms of the words, when they differ from the forms provided in the terminology database. Our submission ranked second in Exact Match metric which evaluates the ability of the model to produce desired terms in the translation.

pdf bib
PROMT Systems for WMT21 Terminology Translation TaskPROMT Systems for WMT21 Terminology Translation Task
Alexander Molchanov | Vladislav Kovalenko | Fedor Bykov

This paper describes the PROMT submissions for the WMT21 Terminology Translation Task. We participate in two directions : English to French and English to Russian. Our final submissions are MarianNMT-based neural systems. We present two technologies for terminology translation : a modification of the Dinu et al. (2019) soft-constrained approach and our own approach called PROMT Smart Neural Dictionary (SmartND). We achieve good results in both directions.

pdf bib
SYSTRAN @ WMT 2021 : Terminology TaskSYSTRAN @ WMT 2021: Terminology Task
Minh Quang Pham | Josep Crego | Antoine Senellart | Dan Berrebbi | Jean Senellart

This paper describes SYSTRAN submissions to the WMT 2021 terminology shared task. We participate in the English-to-French translation direction with a standard Transformer neural machine translation network that we enhance with the ability to dynamically include terminology constraints, a very common industrial practice. Two state-of-the-art terminology insertion methods are evaluated based (i) on the use of placeholders complemented with morphosyntactic annotation and (ii) on the use of target constraints injected in the source stream. Results show the suitability of the presented approaches in the evaluated scenario where terminology is used in a system trained on generic data only.

pdf bib
TermMind : Alibaba’s WMT21 Machine Translation Using Terminologies Task SubmissionTermMind: Alibaba’s WMT21 Machine Translation Using Terminologies Task Submission
Ke Wang | Shuqin Gu | Boxing Chen | Yu Zhao | Weihua Luo | Yuqi Zhang

This paper describes our work in the WMT 2021 Machine Translation using Terminologies Shared Task. We participate in the shared translation terminologies task in English to Chinese language pair. To satisfy terminology constraints on translation, we use a terminology data augmentation strategy based on Transformer model. We used tags to mark and add the term translations into the matched sentences. We created synthetic terms using phrase tables extracted from bilingual corpus to increase the proportion of term translations in training data. Detailed pre-processing and filtering on data, in-domain finetuning and ensemble method are used in our system. Our submission obtains competitive results in the terminology-targeted evaluation.

pdf bib
FJWU Participation for the WMT21 Biomedical Translation TaskFJWU Participation for the WMT21 Biomedical Translation Task
Sumbal Naz | Sadaf Abdul Rauf | Sami Ul Haq

In this paper we present the FJWU’s system submitted to the biomedical shared task at WMT21. We prepared state-of-the-art multilingual neural machine translation systems for three languages (i.e. German, Spanish and French) with English as target language. Our NMT systems based on Transformer architecture, were trained on combination of in-domain and out-domain parallel corpora developed using Information Retrieval (IR) and domain adaptation techniques.

pdf bib
Huawei AARC’s Submissions to the WMT21 Biomedical Translation Task : Domain Adaption from a Practical PerspectiveAARC’s Submissions to the WMT21 Biomedical Translation Task: Domain Adaption from a Practical Perspective
Weixuan Wang | Wei Peng | Xupeng Meng | Qun Liu

This paper describes Huawei Artificial Intelligence Application Research Center’s neural machine translation systems and submissions to the WMT21 biomedical translation shared task. Four of the submissions achieve state-of-the-art BLEU scores based on the official-released automatic evaluation results (EN-FR, EN-IT and ZH-EN). We perform experiments to unveil the practical insights of the involved domain adaptation techniques, including finetuning order, terminology dictionaries, and ensemble decoding. Issues associated with overfitting and under-translation are also discussed.

pdf bib
HW-TSC’s Participation at WMT 2021 Quality Estimation Shared TaskHW-TSC’s Participation at WMT 2021 Quality Estimation Shared Task
Yimeng Chen | Chang Su | Yingtao Zhang | Yuxia Wang | Xiang Geng | Hao Yang | Shimin Tao | Guo Jiaxin | Wang Minghan | Min Zhang | Yujia Liu | Shujian Huang

This paper presents our work in WMT 2021 Quality Estimation (QE) Shared Task. We participated in all of the three sub-tasks, including Sentence-Level Direct Assessment (DA) task, Word and Sentence-Level Post-editing Effort task and Critical Error Detection task, in all language pairs. Our systems employ the framework of Predictor-Estimator, concretely with a pre-trained XLM-Roberta as Predictor and task-specific classifier or regressor as Estimator. For all tasks, we improve our systems by incorporating post-edit sentence or additional high-quality translation sentence in the way of multitask learning or encoding it with predictors directly. Moreover, in zero-shot setting, our data augmentation strategy based on Monte-Carlo Dropout brings up significant improvement on DA sub-task. Notably, our submissions achieve remarkable results over all tasks.

pdf bib
The JHU-Microsoft Submission for WMT21 Quality Estimation Shared TaskJHU-Microsoft Submission for WMT21 Quality Estimation Shared Task
Shuoyang Ding | Marcin Junczys-Dowmunt | Matt Post | Christian Federmann | Philipp Koehn

This paper presents the JHU-Microsoft joint submission for WMT 2021 quality estimation shared task. We only participate in Task 2 (post-editing effort estimation) of the shared task, focusing on the target-side word-level quality estimation. The techniques we experimented with include Levenshtein Transformer training and data augmentation with a combination of forward, backward, round-trip translation, and pseudo post-editing of the MT output. We demonstrate the competitiveness of our system compared to the widely adopted OpenKiwi-XLM baseline. Our system is also the top-ranking system on the MT MCC metric for the English-German language pair.

pdf bib
Papago’s Submission for the WMT21 Quality Estimation Shared TaskWMT21 Quality Estimation Shared Task
Seunghyun Lim | Hantae Kim | Hyunjoong Kim

This paper describes Papago submission to the WMT 2021 Quality Estimation Task 1 : Sentence-level Direct Assessment. Our multilingual Quality Estimation system explores the combination of Pretrained Language Models and Multi-task Learning architectures. We propose an iterative training pipeline based on pretraining with large amounts of in-domain synthetic data and finetuning with gold (labeled) data. We then compress our system via knowledge distillation in order to reduce parameters yet maintain strong performance. Our submitted multilingual systems perform competitively in multilingual and all 11 individual language pair settings including zero-shot.

pdf bib
NICT Kyoto Submission for the WMT’21 Quality Estimation Task : Multimetric Multilingual Pretraining for Critical Error DetectionNICT Kyoto Submission for the WMT’21 Quality Estimation Task: Multimetric Multilingual Pretraining for Critical Error Detection
Raphael Rubino | Atsushi Fujita | Benjamin Marie

This paper presents the NICT Kyoto submission for the WMT’21 Quality Estimation (QE) Critical Error Detection shared task (Task 3). Our approach relies mainly on QE model pretraining for which we used 11 language pairs, three sentence-level and three word-level translation quality metrics. Starting from an XLM-R checkpoint, we perform continued training by modifying the learning objective, switching from masked language modeling to QE oriented signals, before finetuning and ensembling the models. Results obtained on the test set in terms of correlation coefficient and F-score show that automatic metrics and synthetic data perform well for pretraining, with our submissions ranked first for two out of four language pairs. A deeper look at the impact of each metric on the downstream task indicates higher performance for token oriented metrics, while an ablation study emphasizes the usefulness of conducting both self-supervised and QE pretraining.

pdf bib
Direct Exploitation of Attention Weights for Translation Quality Estimation
Lisa Yankovskaya | Mark Fishel

The paper presents our submission to the WMT2021 Shared Task on Quality Estimation (QE). We participate in sentence-level predictions of human judgments and post-editing effort. We propose a glass-box approach based on attention weights extracted from machine translation systems. In contrast to the previous works, we directly explore attention weight matrices without replacing them with general metrics (like entropy). We show that some of our models can be trained with a small amount of a high-cost labelled data. In the absence of training data our approach still demonstrates a moderate linear correlation, when trained with synthetic data.

pdf bib
IST-Unbabel 2021 Submission for the Quality Estimation Shared TaskIST-Unbabel 2021 Submission for the Quality Estimation Shared Task
Chrysoula Zerva | Daan van Stigt | Ricardo Rei | Ana C Farinha | Pedro Ramos | José G. C. de Souza | Taisiya Glushkova | Miguel Vera | Fabio Kepler | André F. T. Martins

We present the joint contribution of IST and Unbabel to the WMT 2021 Shared Task on Quality Estimation. Our team participated on two tasks : Direct Assessment and Post-Editing Effort, encompassing a total of 35 submissions. For all submissions, our efforts focused on training multilingual models on top of OpenKiwi predictor-estimator architecture, using pre-trained multilingual encoders combined with adapters. We further experiment with and uncertainty-related objectives and features as well as training on out-of-domain direct assessment data.

pdf bib
The IICT-Yverdon System for the WMT 2021 Unsupervised MT and Very Low Resource Supervised MT TaskIICT-Yverdon System for the WMT 2021 Unsupervised MT and Very Low Resource Supervised MT Task
Àlex R. Atrio | Gabriel Luthier | Axel Fahy | Giorgos Vernikos | Andrei Popescu-Belis | Ljiljana Dolamic

In this paper, we present the systems submitted by our team from the Institute of ICT (HEIG-VD / HES-SO) to the Unsupervised MT and Very Low Resource Supervised MT task. We first study the improvements brought to a baseline system by techniques such as back-translation and initialization from a parent model. We find that both techniques are beneficial and suffice to reach performance that compares with more sophisticated systems from the 2020 task. We then present the application of this system to the 2021 task for low-resource supervised Upper Sorbian (HSB) to German translation, in both directions. Finally, we present a contrastive system for HSB-DE in both directions, and for unsupervised German to Lower Sorbian (DSB) translation, which uses multi-task training with various training schedules to improve over the baseline.

pdf bib
The LMU Munich Systems for the WMT21 Unsupervised and Very Low-Resource Translation TaskLMU Munich Systems for the WMT21 Unsupervised and Very Low-Resource Translation Task
Jindřich Libovický | Alexander Fraser

We present our submissions to the WMT21 shared task in Unsupervised and Very Low Resource machine translation between German and Upper Sorbian, German and Lower Sorbian, and Russian and Chuvash. Our low-resource systems (GermanUpper Sorbian, RussianChuvash) are pre-trained on high-resource pairs of related languages. We fine-tune those systems using the available authentic parallel data and improve by iterated back-translation. The unsupervised GermanLower Sorbian system is initialized by the best Upper Sorbian system and improved by iterated back-translation using monolingual data only.

pdf bib
cushLEPOR : customising hLEPOR metric using Optuna for higher agreement with human judgments or pre-trained language model LaBSELEPOR: customising hLEPOR metric using Optuna for higher agreement with human judgments or pre-trained language model LaBSE
Lifeng Han | Irina Sorokina | Gleb Erofeev | Serge Gladkoff

Human evaluation has always been expensive while researchers struggle to trust the automatic metrics. To address this, we propose to customise traditional metrics by taking advantages of the pre-trained language models (PLMs) and the limited available human labelled scores. We first re-introduce the hLEPOR metric factors, followed by the Python version we developed (ported) which achieved the automatic tuning of the weighting parameters in hLEPOR metric. Then we present the customised hLEPOR (cushLEPOR) which uses Optuna hyper-parameter optimisation framework to fine-tune hLEPOR weighting parameters towards better agreement to pre-trained language models (using LaBSE) regarding the exact MT language pairs that cushLEPOR is deployed to. We also optimise cushLEPOR towards professional human evaluation data based on MQM and pSQM framework on English-German and Chinese-English language pairs. The experimental investigations show cushLEPOR boosts hLEPOR performances towards better agreements to PLMs like LABSE with much lower cost, and better agreements to human evaluations including MQM and pSQM scores, and yields much better performances than BLEU. Official results show that our submissions win three language pairs including English-German and Chinese-English on News domain via cushLEPOR(LM) and English-Russian on TED domain via hLEPOR. (data available at https://github.com/poethan/cushLEPOR)

pdf bib
MTEQA at WMT21 Metrics Shared TaskMTEQA at WMT21 Metrics Shared Task
Mateusz Krubiński | Erfan Ghadery | Marie-Francine Moens | Pavel Pecina

In this paper, we describe our submission to the WMT 2021 Metrics Shared Task. We use the automatically-generated questions and answers to evaluate the quality of Machine Translation (MT) systems. Our submission builds upon the recently proposed MTEQA framework. Experiments on WMT20 evaluation datasets show that at the system-level the MTEQA metric achieves performance comparable with other state-of-the-art solutions, while considering only a certain amount of information from the whole translation.

pdf bib
Are References Really Needed? Unbabel-IST 2021 Submission for the Metrics Shared TaskIST 2021 Submission for the Metrics Shared Task
Ricardo Rei | Ana C Farinha | Chrysoula Zerva | Daan van Stigt | Craig Stewart | Pedro Ramos | Taisiya Glushkova | André F. T. Martins | Alon Lavie

In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics Shared Task. With this year’s focus on Multidimensional Quality Metric (MQM) as the ground-truth human assessment, our aim was to steer COMET towards higher correlations with MQM. We do so by first pre-training on Direct Assessments and then fine-tuning on z-normalized MQM scores. In our experiments we also show that reference-free COMET models are becoming competitive with reference-based models, even outperforming the best COMET model from 2020 on this year’s development data. Additionally, we present COMETinho, a lightweight COMET model that is 19x faster on CPU than the original model, while also achieving state-of-the-art correlations with MQM. Finally, in the QE as a metric track, we also participated with a QE model trained using the OpenKiwi framework leveraging MQM scores and word-level annotations.

pdf bib
Simultaneous Neural Machine Translation with Constituent Label Prediction
Yasumasa Kano | Katsuhito Sudoh | Satoshi Nakamura

Simultaneous translation is a task in which translation begins before the speaker has finished speaking, so it is important to decide when to start the translation process. However, deciding whether to read more input words or start to translate is difficult for language pairs with different word orders such as English and Japanese. Motivated by the concept of pre-reordering, we propose a couple of simple decision rules using the label of the next constituent predicted by incremental constituent label prediction. In experiments on English-to-Japanese simultaneous translation, the proposed method outperformed baselines in the quality-latency trade-off.

pdf bib
Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information
Yongkeun Hwang | Hyeongu Yun | Kyomin Jung

Context-aware neural machine translation (NMT) incorporates contextual information of surrounding texts, that can improve the translation quality of document-level machine translation. Many existing works on context-aware NMT have focused on developing new model architectures for incorporating additional contexts and have shown some promising results. However, most of existing works rely on cross-entropy loss, resulting in limited use of contextual information. In this paper, we propose CorefCL, a novel data augmentation and contrastive learning scheme based on coreference between the source and contextual sentences. By corrupting automatically detected coreference mentions in the contextual sentence, CorefCL can train the model to be sensitive to coreference inconsistency. We experimented with our method on common context-aware NMT models and two document-level translation tasks. In the experiments, our method consistently improved BLEU of compared models on English-German and English-Korean tasks. We also show that our method significantly improves coreference resolution in the English-German contrastive test suite.