Marcello Federico


2021

Towards Modeling the Style of Translators in Neural Machine Translation
Yue Wang | Cuong Hoang | Marcello Federico
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

One key ingredient of neural machine translation is the use of large datasets from different domains and resources (e.g., Europarl, TED talks). These datasets contain documents translated by professional translators using different but consistent translation styles. Despite that, the model is usually trained in a way that neither explicitly captures the variety of translation styles present in the data nor translates new data in different and controllable styles. In this work, we investigate methods to augment the state-of-the-art Transformer model with translator information that is available in part of the training data. We show that our style-augmented translation models are able to capture the style variations of translators and to generate translations with different styles on new data. Indeed, the generated variations differ significantly, up to a +4.5 BLEU score difference. Nevertheless, human evaluation confirms that the translations are of the same quality.
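
A common way to expose translator metadata to a Transformer, shown here as a minimal Python sketch, is to prepend a style token to each source sentence. The tag format, translator names, and toy corpus below are illustrative assumptions, not necessarily the exact scheme used in the paper.

# Minimal sketch of source-side translator tagging for style-aware NMT.
# The tag names (e.g. "<trans:alice>") and the toy data are illustrative
# assumptions, not the exact scheme used in the paper.

def tag_source(source: str, translator_id: str) -> str:
    """Prepend a translator-style token so the model can condition on it."""
    return f"<trans:{translator_id}> {source}"

corpus = [
    ("Guten Morgen .", "Good morning .", "alice"),
    ("Wie geht es dir ?", "How are you ?", "bob"),
]

tagged_corpus = [(tag_source(src, who), tgt) for src, tgt, who in corpus]

for src, tgt in tagged_corpus:
    print(src, "->", tgt)

# At inference time, choosing the tag selects the translation style:
print(tag_source("Guten Morgen .", "alice"))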

Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Marcello Federico | Alex Waibel | Marta R. Costa-jussà | Jan Niehues | Sebastian Stüker | Elizabeth Salesky
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)

FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN
Antonios Anastasopoulos | Ondřej Bojar | Jacob Bremerman | Roldano Cattoni | Maha Elbayad | Marcello Federico | Xutai Ma | Satoshi Nakamura | Matteo Negri | Jan Niehues | Juan Pino | Elizabeth Salesky | Sebastian Stüker | Katsuhito Sudoh | Marco Turchi | Alexander Waibel | Changhan Wang | Matthew Wiesner
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks this year: (i) simultaneous speech translation, (ii) offline speech translation, (iii) multilingual speech translation, and (iv) low-resource speech translation. A total of 22 teams participated in at least one of the tasks. This paper describes each shared task, its data and evaluation metrics, and reports the results of the received submissions.

2020

Proceedings of the 17th International Conference on Spoken Language Translation
Marcello Federico | Alex Waibel | Kevin Knight | Satoshi Nakamura | Hermann Ney | Jan Niehues | Sebastian Stüker | Dekai Wu | Joseph Mariani | Francois Yvon
Proceedings of the 17th International Conference on Spoken Language Translation

2019

The IWSLT 2019 Evaluation Campaign
Jan Niehues | Roldano Cattoni | Sebastian Stüker | Matteo Negri | Marco Turchi | Thanh-Le Ha | Elizabeth Salesky | Ramon Sanabria | Loïc Barrault | Lucia Specia | Marcello Federico
Proceedings of the 16th International Conference on Spoken Language Translation

The IWSLT 2019 evaluation campaign featured three tasks: speech translation of (i) TED talks and (ii) How2 instructional videos from English into German and Portuguese, and (iii) text translation of TED talks from English into Czech. For the first two tasks we encouraged submissions of end-to-end speech-to-text systems, and for the second task participants could also use the video as additional input. We received submissions from 12 research teams. This overview provides detailed descriptions of the data and evaluation conditions of each task and reports the results of the participating systems.

Adapting Multilingual Neural Machine Translation to Unseen Languages
Surafel M. Lakew | Alina Karakanta | Marcello Federico | Matteo Negri | Marco Turchi
Proceedings of the 16th International Conference on Spoken Language Translation

Multilingual Neural Machine Translation (MNMT) for low-resource languages (LRL) can be enhanced by the presence of related high-resource languages (HRL), but the relatedness of HRL usually relies on predefined linguistic assumptions about language similarity. Recently, adapting MNMT to a LRL has been shown to greatly improve performance. In this work, we explore the problem of adapting an MNMT model to an unseen LRL using data selection and model adaptation. In order to improve NMT for LRL, we employ perplexity to select HRL data that are most similar to the LRL on the basis of language distance. We extensively explore data selection in popular multilingual NMT settings, namely in (zero-shot) translation, and in adaptation from a multilingual pre-trained model, for both directions (LRL↔en). We further show that dynamic adaptation of the model’s vocabulary results in a more favourable segmentation for the LRL in comparison with direct adaptation. Experiments show reductions in training time and significant performance gains over LRL baselines, even with zero LRL data (+13.0 BLEU), and up to +17.0 BLEU for dynamic adaptation of a pre-trained multilingual model with related data selection. Our method outperforms current approaches, such as massively multilingual models and data augmentation, on four LRLs.
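
A minimal sketch of the perplexity-based data selection idea: score candidate HRL sentences under a language model estimated on LRL text and keep the lowest-perplexity ones. The toy character-level unigram model and the example sentences below are illustrative assumptions, not the paper's actual setup.

# Perplexity-based selection of HRL sentences that look most similar to a
# low-resource language (LRL). A toy character-level unigram LM with add-one
# smoothing stands in for the real language model.
import math
from collections import Counter

def train_char_lm(sentences):
    counts = Counter(ch for s in sentences for ch in s)
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 for unseen characters (add-one smoothing)
    return counts, total, vocab

def perplexity(sentence, lm):
    counts, total, vocab = lm
    log_prob = 0.0
    for ch in sentence:
        p = (counts[ch] + 1) / (total + vocab)
        log_prob += math.log(p)
    return math.exp(-log_prob / max(len(sentence), 1))

lrl_monolingual = ["toy lrl sentence one", "another lrl line"]   # LRL sample
hrl_pool = ["related hrl candidate", "unrelated xyzzy qqq", "hrl similar text"]

lm = train_char_lm(lrl_monolingual)
ranked = sorted(hrl_pool, key=lambda s: perplexity(s, lm))
selected = ranked[:2]  # keep the HRL sentences closest to the LRL
print(selected)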

Controlling the Output Length of Neural Machine Translation
Surafel Melaku Lakew | Mattia Di Gangi | Marcello Federico
Proceedings of the 16th International Conference on Spoken Language Translation

The recent advances introduced by neural machine translation (NMT) are rapidly expanding the application fields of machine translation, as well as reshaping the quality level to be targeted. In particular, if translations have to fit some given layout, quality should not only be measured in terms of adequacy and fluency, but also length. Exemplary cases are the translation of document files, subtitles, and scripts for dubbing, where the output length should ideally be as close as possible to the length of the input text. This paper addresses, to the best of our knowledge for the first time, the problem of controlling the output length in NMT. We investigate two methods for biasing the output length with a Transformer architecture: (i) conditioning the output to a given target-source length-ratio class and (ii) enriching the Transformer positional embedding with length information. Our experiments show that both methods can induce the network to generate shorter translations, as well as to acquire interpretable linguistic skills.
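
A minimal sketch of the first of the two methods, conditioning the output on a target-source length-ratio class: each training pair is bucketed by length ratio and the source is prefixed with the corresponding token. Bucket boundaries and token names below are illustrative assumptions, not the paper's exact configuration.

# Length-ratio class tokens for output length control.
def length_token(src_tokens, tgt_tokens, short=0.95, long=1.05) -> str:
    """Bucket a training pair by target/source length ratio."""
    ratio = len(tgt_tokens) / max(len(src_tokens), 1)
    if ratio < short:
        return "<short>"
    if ratio > long:
        return "<long>"
    return "<normal>"

pairs = [
    ("this is a fairly long english sentence".split(), "kurzer satz".split()),
    ("hello".split(), "hallo welt wie geht es".split()),
]

for src, tgt in pairs:
    tagged_src = [length_token(src, tgt)] + src
    print(" ".join(tagged_src), "->", " ".join(tgt))

# At inference time, prepending "<short>" biases the model toward
# translations no longer than the source.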

Training Neural Machine Translation to Apply Terminology Constraints
Georgiana Dinu | Prashant Mathur | Marcello Federico | Yaser Al-Onaizan
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

This paper proposes a novel method to inject custom terminology into neural machine translation at run time. Previous works have mainly proposed modifications to the decoding algorithm in order to constrain the output to include run-time-provided target terms. While effective, these constrained decoding methods add significant computational overhead to the inference step and, as we show in this paper, can be brittle when tested in realistic conditions. In this paper we approach the problem by training a neural MT system to learn how to use custom terminology when provided with the input. Comparative experiments show that our method is not only more effective than a state-of-the-art implementation of constrained decoding, but is also as fast as constraint-free decoding.
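
A minimal sketch of the training-time idea: source sentences are annotated inline with the target terms the model should produce, so no constrained decoding is needed at inference. The markup tokens and the toy term dictionary below are illustrative assumptions, not the paper's exact annotation scheme.

# Inline terminology annotation for training an NMT model to copy terms.
def annotate(source_tokens, terminology):
    """Append the required target term after each matching source term."""
    out = []
    for tok in source_tokens:
        if tok in terminology:
            out.extend([tok, "<sep>", terminology[tok], "<end>"])
        else:
            out.append(tok)
    return out

terminology = {"router": "Router", "firewall": "Brandmauer"}  # toy en->de terms
source = "restart the router and check the firewall".split()

print(" ".join(annotate(source, terminology)))
# A model trained on such annotated inputs learns to copy the provided
# target terms into its output, with no extra cost at decoding time.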

2017

Overview of the IWSLT 2017 Evaluation Campaign
Mauro Cettolo | Marcello Federico | Luisa Bentivogli | Jan Niehues | Sebastian Stüker | Katsuhito Sudoh | Koichiro Yoshino | Christian Federmann
Proceedings of the 14th International Conference on Spoken Language Translation

The IWSLT 2017 evaluation campaign has organised three tasks. The Multilingual task is about training machine translation systems that handle many-to-many language directions, including so-called zero-shot directions. The Dialogue task calls for the integration of context information in machine translation, in order to resolve anaphoric references that typically occur in human-human dialogue turns. Finally, the Lecture task offers the challenge of automatically transcribing and translating real-life university lectures. Following the tradition of these reports, we describe all tasks in detail and present the results of all runs submitted by the participants.

FBK’s Multilingual Neural Machine Translation System for IWSLT 2017
Surafel M. Lakew | Quintino F. Lotito | Marco Turchi | Matteo Negri | Marcello Federico
Proceedings of the 14th International Conference on Spoken Language Translation

Neural Machine Translation has been shown to enable inference and cross-lingual knowledge transfer across multiple language directions using a single multilingual model. Focusing on this multilingual translation scenario, this work summarizes FBK’s participation in the IWSLT 2017 shared task. Our submissions rely on two multilingual systems trained on five languages (English, Dutch, German, Italian, and Romanian). The first one is a 20-language-direction model, which handles all possible combinations of the five languages. The second multilingual system is trained only on 16 directions, leaving the others as zero-shot translation directions (i.e., representing a more complex inference task on language pairs not seen at training time). More specifically, our zero-shot directions are Dutch↔German and Italian↔Romanian (resulting in four language combinations). Despite the small amount of parallel data used for training these systems, the resulting multilingual models are effective, even in comparison with models trained separately for every language pair (i.e., in more favorable conditions). We compare and show the results of the two multilingual models against baseline single-language-pair systems. In particular, we focus on the four zero-shot directions and show how a multilingual model trained with small data can provide reasonable results. Furthermore, we investigate how pivoting (i.e., using a bridge/pivot language for inference, as in source→pivot→target translation) with a multilingual model can be an alternative to enable zero-shot translation in a low-resource setting.
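
A minimal sketch of the underlying multilingual recipe: a target-language token selects the output language, and directions not seen in training can either be decoded zero-shot or routed through a pivot language with the same model. The translate() stub and language codes below are placeholders, not the FBK system.

# Target-language tagging and pivot translation with one multilingual model.
def tag(source: str, tgt_lang: str) -> str:
    """Prepend a target-language token so one model covers many directions."""
    return f"<2{tgt_lang}> {source}"

def translate(source: str, tgt_lang: str) -> str:
    # Stand-in for a trained multilingual model; here it just echoes the input.
    return f"[{tgt_lang}] {tag(source, tgt_lang)}"

def pivot_translate(source: str, pivot: str, tgt_lang: str) -> str:
    """Translate source -> pivot -> target with the same multilingual model."""
    intermediate = translate(source, pivot)
    return translate(intermediate, tgt_lang)

print(translate("Goedemorgen", "de"))              # zero-shot nl -> de
print(pivot_translate("Goedemorgen", "en", "de"))  # nl -> en -> de pivoting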

Monolingual Embeddings for Low Resourced Neural Machine Translation
Mattia Antonino Di Gangi | Marcello Federico
Proceedings of the 14th International Conference on Spoken Language Translation

Neural machine translation (NMT) is the state of the art for machine translation, and it shows the best performance when there is a considerable amount of data available. When only little data exist for a language pair, the model cannot produce good representations for words, particularly for rare words. One common solution consists in reducing data sparsity by segmenting words into sub-words, in order to allow rare words to have shared representations with other words. Taking a different approach, in this paper we present a method to feed an NMT network with word embeddings trained on monolingual data, which are combined with the task-specific embeddings learned at training time. This method can leverage an embedding matrix with a huge number of words, which can therefore extend the word-level vocabulary. Our experiments on two language pairs show good results for the typical low-resourced data scenario (IWSLT in-domain dataset). Our consistent improvements over the baselines provide positive evidence that models pre-trained on monolingual data can be leveraged in NMT.
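
A minimal sketch of the combination step described above: a frozen embedding matrix pre-trained on monolingual data is merged (here by summation, one possible choice) with a task-specific embedding learned during NMT training. Shapes, the toy vocabulary, and the random pre-trained vectors are illustrative assumptions.

# Combining frozen monolingual embeddings with trainable task embeddings.
import numpy as np

vocab = {"the": 0, "cat": 1, "sat": 2}
dim = 4

pretrained = np.random.RandomState(0).randn(len(vocab), dim)  # frozen, from monolingual data
learned = np.zeros((len(vocab), dim))                          # updated during NMT training

def embed(token_ids):
    """Sum the frozen pre-trained vector with the trainable one per token."""
    return pretrained[token_ids] + learned[token_ids]

ids = np.array([vocab[w] for w in "the cat sat".split()])
print(embed(ids).shape)  # (3, 4): one combined vector per input token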

Improving Zero-Shot Translation of Low-Resource Languages
Surafel M. Lakew | Quintino F. Lotito | Matteo Negri | Marco Turchi | Marcello Federico
Proceedings of the 14th International Conference on Spoken Language Translation

Recent work on multilingual neural machine translation reported competitive performance with respect to bilingual models and surprisingly good performance even on (zero-shot) translation directions not observed at training time. Here we investigate zero-shot translation in a particularly low-resource multilingual setting. We propose a simple iterative training procedure that leverages a duality of translations directly generated by the system for the zero-shot directions. The translations produced by the system (sub-optimal since they contain mixed language from the shared vocabulary) are then used together with the original parallel data to feed and iteratively re-train the multilingual network. Over time, this allows the system to learn from its own generated and increasingly better output. Our approach proves effective in improving the two zero-shot directions of our multilingual model. In particular, we observed gains of about 9 BLEU points over a baseline multilingual model and up to 2.08 BLEU over a pivoting mechanism using two bilingual models. Further analysis shows that there is also a slight improvement in the non-zero-shot language directions.
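
A minimal sketch of the iterative training procedure: the current model translates monolingual data along a zero-shot direction, and the resulting synthetic pairs are added to the original parallel data for the next training round. All function stubs, language choices, and the number of rounds below are illustrative assumptions.

# Iterative self-training on system-generated zero-shot translations.
def train(parallel_data):
    """Stand-in for training a multilingual NMT model on parallel pairs."""
    return {"data_size": len(parallel_data)}  # dummy "model"

def translate(model, sentence, tgt_lang):
    """Stand-in for decoding a zero-shot direction with the current model."""
    return f"<{tgt_lang}> {sentence}"

original_parallel = [("hello", "hallo"), ("good night", "buona notte")]
zero_shot_monolingual = ["ciao", "arrivederci"]  # e.g. Italian sentences

model = train(original_parallel)
for round_idx in range(3):
    synthetic = [(s, translate(model, s, "nl")) for s in zero_shot_monolingual]
    model = train(original_parallel + synthetic)
    print(f"round {round_idx}: trained on {model['data_size']} pairs")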