Martin Popel


2020

Speed-optimized, Compact Student Models that Distill Knowledge from a Larger Teacher Model: the UEDIN-CUNI Submission to the WMT 2020 News Translation Task
Ulrich Germann | Roman Grundkiewicz | Martin Popel | Radina Dobreva | Nikolay Bogoychev | Kenneth Heafield
Proceedings of the Fifth Conference on Machine Translation

We describe the joint submission of the University of Edinburgh and Charles University, Prague, to the Czech↔English track in the WMT 2020 Shared Task on News Translation. Our fast and compact student models distill knowledge from a larger, slower teacher. They are designed to offer a good trade-off between translation quality and inference efficiency. On the WMT 2020 Czech↔English test sets, they achieve translation speeds of over 700 whitespace-delimited source words per second on a single CPU thread, thus making neural translation feasible on consumer hardware without a GPU.
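As a rough illustration of the setup, here is a minimal sketch of sequence-level knowledge distillation and of the abstract's speed metric. The `teacher_translate` and `translate` callables are hypothetical stand-ins for a real NMT toolkit; this is not the submission's actual pipeline.

```python
import time

def distillation_pairs(teacher_translate, sources):
    """Sequence-level distillation: pair each source sentence with the
    teacher's output, so the student is trained to mimic the teacher
    rather than the human reference."""
    return [(src, teacher_translate(src)) for src in sources]

def source_words_per_second(translate, sources):
    """The speed metric from the abstract: whitespace-delimited source
    words translated per second."""
    n_words = sum(len(s.split()) for s in sources)
    start = time.perf_counter()
    for s in sources:
        translate(s)
    return n_words / (time.perf_counter() - start)
```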

CUNI English-Czech and English-Polish Systems in WMT20: Robust Document-Level Training
Martin Popel
Proceedings of the Fifth Conference on Machine Translation

We describe our two NMT systems submitted to the WMT 2020 shared task in English-Czech and English-Polish news translation. One system is sentence-level, translating each sentence independently. The second system is document-level, translating multiple sentences at once; it is trained on multi-sentence sequences up to 3000 characters long.
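A minimal sketch of how such multi-sentence training sequences could be assembled from a document; the greedy character-budget packing below is an assumption, not the paper's exact preprocessing:

```python
def pack_segments(sentences, max_chars=3000):
    """Greedily pack consecutive sentences of one document into
    multi-sentence training segments of at most max_chars characters."""
    segments, current, length = [], [], 0
    for sent in sentences:
        extra = len(sent) + (1 if current else 0)  # +1 for the joining space
        if current and length + extra > max_chars:
            segments.append(" ".join(current))
            current, length = [], 0
            extra = len(sent)
        current.append(sent)
        length += extra
    if current:
        segments.append(" ".join(current))
    return segments
```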

2019

English-Czech Systems in WMT19: Document-Level Transformer
Martin Popel | Dominik Macháček | Michal Auersperger | Ondřej Bojar | Pavel Pecina
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We describe our NMT systems submitted to the WMT19 shared task in English→Czech news translation. Our systems are based on the Transformer model implemented in either the Tensor2Tensor (T2T) or Marian framework. We aimed at improving the adequacy and coherence of translated documents by enlarging the context of the source and target. Instead of translating each sentence independently, we split the document into possibly overlapping multi-sentence segments. In the case of the T2T implementation, this document-level-trained system achieves a +0.6 BLEU improvement (p < 0.05) relative to the same system applied on isolated sentences. To assess the potential effect document-level models might have on lexical coherence, we performed a semi-automatic analysis, which revealed only a few sentences improved in this aspect. Thus, we cannot draw any conclusions from this weak evidence.
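The overlapping segmentation can be pictured as a sliding window over the document's sentences; the window size and stride below are illustrative defaults, not the values used in the paper:

```python
def overlapping_segments(sentences, size=3, stride=2):
    """Split a document into possibly overlapping multi-sentence
    segments: consecutive windows share size - stride sentences, so
    each sentence gets surrounding context in at least one segment."""
    segments, i = [], 0
    while i < len(sentences):
        segments.append(" ".join(sentences[i:i + size]))
        if i + size >= len(sentences):
            break
        i += stride
    return segments
```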

CUNI System for the WMT19 Robustness Task
Jindřich Helcl | Jindřich Libovický | Martin Popel
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We present our submission to the WMT19 Robustness Task. Our baseline system is the Charles University (CUNI) Transformer system trained for the WMT18 shared task on News Translation. Quantitative results show that the CUNI Transformer system is already far more robust to noisy input than the LSTM-based baseline provided by the task organizers. We further improved the performance of our model by fine-tuning on the in-domain noisy data without influencing the translation quality on the news domain.
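The abstract does not spell out the fine-tuning recipe that preserves news-domain quality. One standard approach is to mix in-domain noisy data with the original training data during fine-tuning; the sketch below shows that assumed recipe, not the paper's confirmed method:

```python
import random

def mixed_batches(noisy_pairs, news_pairs, noisy_ratio=0.5, seed=0):
    """Yield fine-tuning examples drawn from both the in-domain noisy
    data and the original news-domain data, so the model adapts to
    noise without forgetting the news domain (an assumed recipe)."""
    rng = random.Random(seed)
    while True:
        pool = noisy_pairs if rng.random() < noisy_ratio else news_pairs
        yield rng.choice(pool)
```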

2018

CUNI Transformer Neural MT System for WMT18
Martin Popel
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We describe our NMT system submitted to the WMT2018 shared task in news translation. Our system is based on the Transformer model (Vaswani et al., 2017). We use an improved technique of back-translation, where we iterate the process of translating monolingual data in one direction and training an NMT model for the opposite direction using synthetic parallel data. We apply a simple but effective filtering of the synthetic data. We pre-process the input sentences using coreference resolution in order to disambiguate the gender of pro-dropped personal pronouns. Finally, we apply two simple post-processing substitutions on the translated output. Our system is significantly (p < 0.05) better than all other English-Czech and Czech-English systems in WMT2018.
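The iterated back-translation loop can be sketched as follows. Here `train`, `translate`, and `keep` are hypothetical stand-ins for a real NMT toolkit and for the paper's synthetic-data filter; the number of rounds is illustrative:

```python
def iterated_back_translation(train, translate, authentic_en_cs,
                              en_mono, cs_mono, rounds=2,
                              keep=lambda pair: True):
    """Alternately translate monolingual data in one direction and
    retrain the model for the opposite direction on authentic plus
    filtered synthetic parallel data."""
    authentic_cs_en = [(cs, en) for en, cs in authentic_en_cs]
    en2cs = train(authentic_en_cs)
    cs2en = train(authentic_cs_en)
    for _ in range(rounds):
        # Czech monolingual -> synthetic English sources: retrain en->cs.
        synth = [(hyp, cs) for cs, hyp in zip(cs_mono, translate(cs2en, cs_mono))]
        en2cs = train(authentic_en_cs + [p for p in synth if keep(p)])
        # English monolingual -> synthetic Czech sources: retrain cs->en.
        synth = [(hyp, en) for en, hyp in zip(en_mono, translate(en2cs, en_mono))]
        cs2en = train(authentic_cs_en + [p for p in synth if keep(p)])
    return en2cs, cs2en
```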

2017

CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
Daniel Zeman | Martin Popel | Milan Straka | Jan Hajič | Joakim Nivre | Filip Ginter | Juhani Luotolahti | Sampo Pyysalo | Slav Petrov | Martin Potthast | Francis Tyers | Elena Badmaeva | Memduh Gokirmak | Anna Nedoluzhko | Silvie Cinková | Jan Hajič jr. | Jaroslava Hlaváčová | Václava Kettnerová | Zdeňka Urešová | Jenna Kanerva | Stina Ojala | Anna Missilä | Christopher D. Manning | Sebastian Schuster | Siva Reddy | Dima Taji | Nizar Habash | Herman Leung | Marie-Catherine de Marneffe | Manuela Sanguinetti | Maria Simi | Hiroshi Kanayama | Valeria de Paiva | Kira Droganova | Héctor Martínez Alonso | Çağrı Çöltekin | Umut Sulubacak | Hans Uszkoreit | Vivien Macketanz | Aljoscha Burchardt | Kim Harris | Katrin Marheinecke | Georg Rehm | Tolga Kayadelen | Mohammed Attia | Ali Elkahky | Zhuoran Yu | Emily Pitler | Saran Lertpradit | Michael Mandl | Jesse Kirchner | Hector Fernandez Alcalde | Jana Strnadová | Esha Banerjee | Ruli Manurung | Antonio Stella | Atsuko Shimada | Sookyoung Kwak | Gustavo Mendonça | Tatiana Lando | Rattima Nitisaroj | Josie Li
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.
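The task's main metric was the labelled attachment score (LAS). A minimal sketch of its core computation for one sentence, with gold and system analyses given as per-word (head index, relation) pairs; the real evaluation additionally aligns system tokenization with the gold tokenization, since systems start from raw text:

```python
def las(gold, system):
    """Labelled attachment score: the fraction of words whose predicted
    head index and dependency relation both match the gold annotation."""
    assert len(gold) == len(system)
    correct = sum(1 for (gh, gr), (sh, sr) in zip(gold, system)
                  if gh == sh and gr == sr)
    return correct / len(gold) if gold else 0.0
```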