Conference of the European Association for Machine Translation (2020)


pdf (full)
bib (full)
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

pdf bib
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation
André Martins | Helena Moniz | Sara Fumega | Bruno Martins | Fernando Batista | Luisa Coheur | Carla Parra | Isabel Trancoso | Marco Turchi | Arianna Bisazza | Joss Moorkens | Ana Guerberof | Mary Nurminen | Lena Marg | Mikel L. Forcada

pdf bib
Efficiently Reusing Old Models Across Languages via Transfer Learning
Tom Kocmi | Ondřej Bojar

Recent progress in neural machine translation (NMT) is directed towards larger neural networks trained on an increasing amount of hardware resources. As a result, NMT models are costly to train, both financially, due to the electricity and hardware cost, and environmentally, due to the carbon footprint. This is especially true of transfer learning, which incurs the additional cost of training the parent model before transferring knowledge and training the desired child model. In this paper, we propose a simple method of re-using an already trained model for different language pairs, with no need for modifications to the model architecture. Our approach does not need a separate parent model for each investigated language pair, as is typical in NMT transfer learning. To show the applicability of our method, we recycle a Transformer model trained by different researchers and use it to seed models for different language pairs. We achieve better translation quality and shorter convergence times than when training from random initialization.

pdf bib
Efficient Transfer Learning for Quality Estimation with Bottleneck Adapter Layer
Hao Yang | Minghan Wang | Ning Xie | Ying Qin | Yao Deng

The Predictor-Estimator framework for quality estimation (QE) is commonly used for its strong performance, with the predictor and the estimator handling feature extraction and quality evaluation, respectively. However, training the predictor from scratch is computationally expensive. In this paper, we propose an efficient transfer learning framework to transfer knowledge from an NMT dataset into QE models. We also propose a Predictor-Estimator-like model named BAL-QE, which aims to extract high-quality features with a pre-trained NMT model and to perform classification with a fine-tuned Bottleneck Adapter Layer (BAL). The experiment shows that BAL-QE achieves 97% of the SOTA performance in the WMT19 En-De and En-Ru QE tasks by training only 3% of the parameters, within 4 hours on 4 Titan XP GPUs. Compared with the commonly used NuQE baseline, BAL-QE achieves performance improvements of 47% (En-Ru) and 75% (En-De).

pdf bib
Incorporating External Annotation to improve Named Entity Translation in NMT
Maciej Modrzejewski | Miriam Exel | Bianka Buschbeck | Thanh-Le Ha | Alexander Waibel

The correct translation of named entities (NEs) still poses a challenge for conventional neural machine translation (NMT) systems. This study explores methods incorporating named entity recognition (NER) into NMT with the aim to improve named entity translation. It proposes an annotation method that integrates named entities and inside–outside–beginning (IOB) tagging into the neural network input with the use of source factors. Our experiments on English–German and English–Chinese show that just by including different NE classes and IOB tagging, we can increase the BLEU score by around 1 point using the standard test set from WMT2019 and achieve up to 12% increase in NE translation rates over a strong baseline.

pdf bib
A multi-source approach for Breton–French hybrid machine translation
Víctor M. Sánchez-Cartagena | Mikel L. Forcada | Felipe Sánchez-Martínez

Corpus-based approaches to machine translation (MT) have difficulties when the amount of parallel corpora to use for training is scarce, especially if the languages involved in the translation are highly inflected. This problem can be addressed from different perspectives, including data augmentation, transfer learning, and the use of additional resources, such as those used in rule-based MT. This paper focuses on the hybridisation of rule-based MT and neural MT for the Breton–French under-resourced language pair in an attempt to study to what extent the rule-based MT resources help improve the translation quality of the neural MT system for this particular under-resourced language pair. We combine both translation approaches in a multi-source neural MT architecture and find out that, even though the rule-based system has a low performance according to automatic evaluation metrics, using it leads to improved translation quality.

pdf bib
Leveraging Multilingual Resources for Language Invariant Sentiment Analysis
Allen Antony | Arghya Bhattacharya | Jaipal Goud | Radhika Mamidi

Sentiment analysis is a widely researched NLP problem with state-of-the-art solutions capable of attaining human-like accuracies for various languages. However, these methods rely heavily on large amounts of labeled data or sentiment weighted language-specific lexical resources that are unavailable for low-resource languages. Our work attempts to tackle this data scarcity issue by introducing a neural architecture for language invariant sentiment analysis capable of leveraging various monolingual datasets for training without any kind of cross-lingual supervision. The proposed architecture attempts to learn language agnostic sentiment features via adversarial training on multiple resource-rich languages which can then be leveraged for inferring sentiment information at a sentence level on a low resource language. Our model outperforms the current state-of-the-art methods on the Multilingual Amazon Review Text Classification dataset [ REF ] and achieves significant performance gains over prior work on the low resource Sentiraama corpus [ REF ]. A detailed analysis of our research highlights the ability of our architecture to perform significantly well in the presence of minimal amounts of training data for low resource languages.

pdf bib
Double Attention-based Multimodal Neural Machine Translation with Semantic Image Regions
Yuting Zhao | Mamoru Komachi | Tomoyuki Kajiwara | Chenhui Chu

Existing studies on multimodal neural machine translation (MNMT) have mainly focused on the effect of combining visual and textual modalities to improve translations. However, it has been suggested that the visual modality is only marginally beneficial. Conventional visual attention mechanisms have been used to select the visual features from equally-sized grids generated by convolutional neural networks (CNNs), and may have had modest effects on aligning the visual concepts associated with textual objects, because the grid visual features do not capture semantic information. In contrast, we propose the application of semantic image regions for MNMT by integrating visual and textual features using two individual attention mechanisms (double attention). We conducted experiments on the Multi30k dataset and achieved an improvement of 0.5 and 0.9 BLEU points for English-German and English-French translation tasks, compared with the MNMT with grid visual features. We also demonstrated concrete improvements in translation performance that result from using semantic image regions.

pdf bib
Fine-grained Human Evaluation of Transformer and Recurrent Approaches to Neural Machine Translation for English-to-Chinese
Yuying Ye | Antonio Toral

This research presents a fine-grained human evaluation to compare the Transformer and recurrent approaches to neural machine translation (MT), on the translation direction English-to-Chinese. To this end, we develop an error taxonomy compliant with the Multidimensional Quality Metrics (MQM) framework that is customised to the relevant phenomena of this translation direction. We then conduct an error annotation using this customised error taxonomy on the output of state-of-the-art recurrent- and Transformer-based MT systems on a subset of WMT2019’s news test set. The resulting annotation shows that, compared to the best recurrent system, the best Transformer system results in a 31% reduction of the total number of errors and produces significantly fewer errors in 10 out of 22 error categories. We also note that two of the systems evaluated do not produce any error for a category that was relevant for this translation direction prior to the advent of NMT systems: Chinese classifiers.

pdf bib
Correct Me If You Can: Learning from Error Corrections and Markings
Julia Kreutzer | Nathaniel Berger | Stefan Riezler

Sequence-to-sequence learning involves a trade-off between signal strength and annotation cost of training data. For example, machine translation data range from costly expert-generated translations that enable supervised learning, to weak quality-judgment feedback that facilitates reinforcement learning. We present the first user study on annotation cost and machine learnability for the less popular annotation mode of error markings. We show that error markings for translations of TED talks from English to German allow precise credit assignment while requiring significantly less human effort than correcting/post-editing, and that error-marked data can be used successfully to fine-tune neural machine translation models.

pdf bib
Fine-Grained Error Analysis on English-to-Japanese Machine Translation in the Medical Domain
Takeshi Hayakawa | Yuki Arase

We performed a detailed error analysis in domain-specific neural machine translation (NMT) for the English and Japanese language pair with fine-grained manual annotation. Despite its importance for advancing NMT technologies, research on the performance of domain-specific NMT and non-European languages has been limited. In this study, we designed an error typology based on the error types that were typically generated by NMT systems and might cause significant impact in technical translations: Addition, Omission, Mistranslation, Grammar, and Terminology. The error annotation was targeted to the medical domain and was performed by experienced professional translators specialized in medicine under careful quality control. The annotation detected 4,912 errors on 2,480 sentences, and the frequency and distribution of errors were analyzed. We found that the major errors in NMT were Mistranslation and Terminology rather than Addition and Omission, which have been reported as typical problems of NMT. Interestingly, more errors occurred in documents for professionals compared with those for the general public. The results of our annotation work will be published as a parallel corpus with error labels, which are expected to contribute to developing better NMT models, automatic evaluation metrics, and quality estimation models.

pdf bib
Modelling Source- and Target-Language Syntactic Information as Conditional Context in Interactive Neural Machine Translation
Kamal Kumar Gupta | Rejwanul Haque | Asif Ekbal | Pushpak Bhattacharyya | Andy Way

In interactive machine translation (MT), human translators correct errors in automatic translations in collaboration with the MT systems, which is seen as an effective way to improve the productivity gain in translation. In this study, we model source-language syntactic constituency parse and target-language syntactic descriptions in the form of supertags as conditional context for interactive prediction in neural MT (NMT). We found that the supertags significantly improve productivity gain in translation in interactive-predictive NMT (INMT), while syntactic parsing was found to be somewhat effective in reducing human effort in translation. Furthermore, when we model this source- and target-language syntactic information together as the conditional context, both types complement each other and our fully syntax-informed INMT model statistically significantly reduces human effort in a French-to-English translation task, achieving 4.30 points absolute (corresponding to 9.18% relative) improvement in terms of word prediction accuracy (WPA) and 4.84 points absolute (corresponding to 9.01% relative) reduction in terms of word stroke ratio (WSR) over the baseline.

pdf bib
Evaluating the usefulness of neural machine translation for the Polish translators in the European Commission
Karolina Stefaniak

The mission of the Directorate General for Translation (DGT) is to provide high-quality translation to help the European Commission communicate with EU citizens. To this end DGT employs almost 2000 translators from all EU official languages. But while the demand for translation has been continuously growing, following a global trend, the number of translators has decreased. To cope with the demand, DGT extensively uses a CAT environment encompassing translation memories, terminology databases and recently also machine translation. This paper examines the benefits and risks of using neural machine translation to augment the productivity of in-house DGT translators for the English–Polish language pair. Based on the analysis of a sample of NMT-translated texts and on the observations of the working practices of Polish translators it is concluded that the possible productivity gain is still modest, while the risks to quality are quite substantial.

pdf bib
Terminology-Constrained Neural Machine Translation at SAP
Miriam Exel | Bianka Buschbeck | Lauritz Brandt | Simona Doneva

This paper examines approaches to bias a neural machine translation model to adhere to terminology constraints in an industrial setup. In particular, we investigate variations of the approach by Dinu et al. (2019), which uses inline annotation of the target terms in the source segment plus source factor embeddings during training and inference, and compare them to constrained decoding. We describe the challenges with respect to terminology in our usage scenario at SAP and show how far the investigated methods can help to overcome them. We extend the original study to a new language pair and provide an in-depth evaluation including an error classification and a human evaluation.

pdf bib
Bifixer and Bicleaner: two open-source tools to clean your parallel data
Gema Ramírez-Sánchez | Jaume Zaragoza-Bernabeu | Marta Bañón | Sergio Ortiz Rojas

This paper shows the utility of two open-source tools designed for parallel data cleaning: Bifixer and Bicleaner. These tools have already been used to clean highly noisy parallel content from crawled multilingual websites; here we evaluate their performance in a different scenario: cleaning publicly available corpora commonly used to train machine translation systems. We choose four English–Portuguese corpora which we plan to use internally to compute paraphrases at a later stage. We clean the four corpora using both tools, which are described in detail, and analyse the effect of some of the cleaning steps on them. We then compare machine translation training times and quality before and after cleaning these corpora, showing a positive impact particularly for the noisiest ones.

pdf bib
An English-Swahili parallel corpus and its use for neural machine translation in the news domain
Felipe Sánchez-Martínez | Víctor M. Sánchez-Cartagena | Juan Antonio Pérez-Ortiz | Mikel L. Forcada | Miquel Esplà-Gomis | Andrew Secker | Susie Coleman | Julie Wall

This paper describes our approach to create a neural machine translation system to translate between English and Swahili (both directions) in the news domain, as well as the process we followed to crawl the necessary parallel corpora from the Internet. We report the results of a pilot human evaluation performed by the news media organisations participating in the H2020 EU-funded project GoURMET.

pdf bib
A User Study of the Incremental Learning in NMT
Miguel Domingo | Mercedes García-Martínez | Álvaro Peris | Alexandre Helle | Amando Estela | Laurent Bié | Francisco Casacuberta | Manuel Herranz

In the translation industry, human experts usually supervise and post-edit machine translation hypotheses. Adaptive neural machine translation systems, able to incrementally update the underlying models under an online learning regime, have been proven to be useful to improve the efficiency of this workflow. However, this incremental adaptation is somewhat unstable, and it may lead to undesirable side effects. One of them is the sporadic appearance of made-up words, as a byproduct of an erroneous application of subword segmentation techniques. In this work, we extend previous studies on on-the-fly adaptation of neural machine translation systems. We perform a user study involving professional, experienced post-editors, delving deeper into the aforementioned problems. Results show that adaptive systems were able to learn how to generate the correct translation for task-specific terms, resulting in an improvement of the user’s productivity. We also observed a close similarity, in terms of morphology, between made-up words and the words that were expected.

pdf bib
How do LSPs compute MT discounts? Presenting a company’s pipeline and its use
Randy Scansani | Lamis Mhedhbi

In this paper we present a pipeline developed at Acolad to test a Machine Translation (MT) engine and compute the discount to be applied when its output is used in production. Our pipeline includes three main steps where quality and productivity are measured through automatic metrics, manual evaluation, and by keeping track of editing and temporal effort during a post-editing task. Thanks to this approach, it is possible to evaluate the output quality and compute an engine-specific discount. Our test pipeline tackles the complexity of transforming productivity measurements into discounts by comparing the outcome of each of the above-mentioned steps to an estimate of the average productivity of translation from scratch. The discount is obtained by subtracting the resulting coefficient from the per-word rate. After a description of the pipeline, the paper presents its application on four engines, discussing its results and showing that our method to estimate post-editing effort through manual evaluation seems to capture the actual productivity. The pipeline relies heavily on the work of professional post-editors, with the aim of creating a mutually beneficial cooperation between users and developers.

pdf bib
Comparing Post-editing based on Four Editing Actions against Translating with an Auto-Complete Feature
Félix Do Carmo

This article describes the results of a workshop in which 50 translators tested two experimental translation interfaces, as part of a project which aimed at studying the details of editing work. In this work, editing is defined as a selection of four actions: deleting, inserting, moving and replacing words. Four texts, machine-translated from English into European Portuguese, were post-edited in four different sessions in which each translator swapped between texts and two work modes. One of the work modes involved a typical auto-complete feature, and the other was based on the four actions. The participants answered surveys before, during and after the workshop. A descriptive analysis of the answers to the surveys and of the logs recorded during the experiments was performed. The four editing actions mode is shown to be more intrusive, but to allow for more planned decisions: although they take more time in this mode, translators hesitate less and make fewer edits. The article shows the usefulness of the approach for research on the editing task.

pdf bib
Document-Level Machine Translation Evaluation Project: Methodology, Effort and Inter-Annotator Agreement
Sheila Castilho

Document-level (doc-level) human evaluation of machine translation (MT) has raised interest in the community after a few attempts have disproved claims of human parity (Toral et al., 2018; Läubli et al., 2018). However, little is known about best practices regarding doc-level human evaluation. The goal of this project is to identify which methodologies better cope with i) the current state-of-the-art (SOTA) human metrics, ii) a possible complexity when assigning a single score to a text consisting of ‘good’ and ‘bad’ sentences, iii) a possible tiredness bias in doc-level set-ups, and iv) the difference in inter-annotator agreement (IAA) between sentence and doc-level set-ups.

pdf bib
CEF Data Marketplace: Powering a Long-term Supply of Language Data
Amir Kamran | Dace Dzeguze | Jaap van der Meer | Milica Panic | Alessandro Cattelan | Daniele Patrioli | Luisa Bentivogli | Marco Turchi

We describe the CEF Data Marketplace project, which focuses on the development of a trading platform of translation data for language professionals: translators, machine translation (MT) developers, language service providers (LSPs), translation buyers and government bodies. The CEF Data Marketplace platform will be designed and built to manage and trade data for all languages and domains. This project will open a continuous and long-term supply of language data for MT and other machine learning applications.

pdf bib
MICE: a middleware layer for MT
Joachim Van den Bogaert | Tom Vanallemeersch | Heidi Depraetere

The MICE project (2018-2020) will deliver a middleware layer for improving the output quality of the eTranslation system of EC’s Connecting Europe Facility through additional services, such as domain adaptation and named entity recognition. It will also deliver a user portal, allowing for human post-editing.

pdf bib
Neural Translation for the European Union (NTEU) Project
Laurent Bié | Aleix Cerdà-i-Cucó | Hans Degroote | Amando Estela | Mercedes García-Martínez | Manuel Herranz | Alejandro Kohan | Maite Melero | Tony O’Dowd | Sinéad O’Gorman | Mārcis Pinnis | Roberts Rozis | Riccardo Superbo | Artūrs Vasiļevskis

The Neural Translation for the European Union (NTEU) project aims to build a neural engine farm with all European official language combinations for eTranslation, without the necessity to use a high-resourced language as a pivot. NTEU started in September 2019 and will run until August 2021.

pdf bib
OCR, Classification & Machine Translation (OCCAM)
Joachim Van den Bogaert | Arne Defauw | Frederic Everaert | Koen Van Winckel | Alina Kramchaninova | Anna Bardadym | Tom Vanallemeersch | Pavel Smrž | Michal Hradiš

The OCCAM project (Optical Character recognition, ClassificAtion & Machine Translation) aims at integrating the CEF (Connecting Europe Facility) Automated Translation service with image classification, Translation Memories (TMs), Optical Character Recognition (OCR), and Machine Translation (MT). It will support the automated translation of scanned business documents (a document format that, currently, cannot be processed by the CEF eTranslation service) and will also lead to a tool useful for the Digital Humanities domain.

pdf bib
Assessing the Comprehensibility of Automatic Translations (ArisToCAT)
Lieve Macken | Margot Fonteyne | Arda Tezcan | Joke Daems

The ArisToCAT project aims to assess the comprehensibility of ‘raw’ (unedited) MT output for readers who can only rely on the MT output. In this project description, we summarize the main results of the project and present future work.

pdf bib
MTrill project: Machine Translation impact on language learning
Natália Resende | Andy Way

Over the last decades, massive research investments have been made in the development of machine translation (MT) systems (Gupta and Dhawan, 2019). This has brought about a paradigm shift in the performance of these language tools, leading to widespread use of popular MT systems (Gaspari and Hutchins, 2007). Although the first MT engines were used for gisting purposes, in recent years, there has been an increasing interest in using MT tools, especially the freely available online MT tools, for language teaching and learning (Clifford et al., 2013). The literature on MT and Computer Assisted Language Learning (CALL) shows that, over the years, MT systems have been facilitating language teaching and also language learning (Niño, 2006). It has been shown that MT tools can increase awareness of grammatical linguistic features of a foreign language. Research also shows the positive role of MT systems in the development of writing skills in English as well as in improving communication skills in English (Garcia and Pena, 2011). However, to date, the cognitive impact of MT on language acquisition and on the syntactic aspects of language processing has not yet been investigated and deserves further scrutiny. The MTril project aims at filling this gap in the literature by examining whether MT is contributing to a central aspect of language acquisition: the so-called language binding, i.e., the ability to combine single words properly in a grammatical sentence (Heyselaar et al., 2017; Ferreira and Bock, 2006).