Proceedings of the 6th Workshop on Asian Translation

Toshiaki Nakazawa, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Nobushige Doi, Yusuke Oda, Ondřej Bojar, Shantipriya Parida, Isao Goto, Hideya Mino (Editors)

Hong Kong, China
Association for Computational Linguistics

Overview of the 6th Workshop on Asian Translation
Toshiaki Nakazawa | Nobushige Doi | Shohei Higashiyama | Chenchen Ding | Raj Dabre | Hideya Mino | Isao Goto | Win Pa Pa | Anoop Kunchukuttan | Yusuke Oda | Shantipriya Parida | Ondřej Bojar | Sadao Kurohashi

This paper presents the results of the shared tasks from the 6th Workshop on Asian Translation (WAT2019), including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and the Ru↔Ja news commentary translation task. For WAT2019, 25 teams participated in the shared tasks. We also received 10 research paper submissions, out of which 6 were accepted. About 400 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.

Compact and Robust Models for Japanese-English Character-level Machine Translation
Jinan Dai | Kazunori Yamaguchi

Character-level translation has been shown to achieve favorable translation quality without explicit segmentation, but training a character-level model requires substantial hardware resources. In this paper, we introduced two character-level translation models for Japanese-English translation: a mid-gated model and a multi-attention model. We showed that the mid-gated model achieved better performance in terms of BLEU scores. We also showed that a relatively narrow beam of width 4 or 5 was sufficient for the mid-gated model. As for unknown words, we showed that the mid-gated model could, to some extent, translate those containing Katakana by coining a close word. We also showed that the model produced tolerable results for heavily noised sentences, even though it was trained on a dataset without noise.

NICT’s participation to WAT 2019: Multilingualism and Multi-step Fine-Tuning for Low Resource NMT
Raj Dabre | Eiichiro Sumita

In this paper we describe our submissions to WAT 2019 for the following tasks: English↔Tamil translation and Russian↔Japanese translation. Our team, NICT-5, focused on multilingual domain adaptation and back-translation for Russian↔Japanese translation and on simple fine-tuning for English↔Tamil translation. We noted that multi-stage fine-tuning is essential in leveraging the power of multilingualism for an extremely low-resource language pair like Russian↔Japanese. Furthermore, we can improve the performance of such a low-resource language pair by exploiting a small but in-domain monolingual corpus via back-translation. We managed to obtain second rank in both tasks for all translation directions.

LTRC-MT Simple & Effective Hindi-English Neural Machine Translation Systems at WAT 2019
Vikrant Goyal | Dipti Misra Sharma

This paper describes the Neural Machine Translation systems of IIIT-Hyderabad (LTRC-MT) for the WAT 2019 Hindi-English shared task. We experimented with both Recurrent Neural Network and Transformer architectures. We also show the results of our experiments on training NMT models with additional data obtained via backtranslation.

Supervised neural machine translation based on data augmentation and improved training & inference process
Yixuan Tong | Liang Liang | Boyan Liu | Shanshan Jiang | Bin Dong

This is the second time SRCB has participated in WAT. This paper describes our neural machine translation systems for the shared translation tasks of WAT 2019. We participated in the ASPEC tasks and submitted results for four language pairs: English-Japanese, Japanese-English, Chinese-Japanese, and Japanese-Chinese. We employed the Transformer model as the baseline and experimented with relative position representation, data augmentation, deep layer models, and ensembling. Experiments show that all these methods can yield substantial improvements.

NLPRL at WAT2019: Transformer-based Tamil – English Indic Task Neural Machine Translation System
Amit Kumar | Anil Kumar Singh

This paper describes the Machine Translation system for the Tamil-English Indic Task organized at WAT 2019. We use a Transformer-based architecture for Neural Machine Translation.

Idiap NMT System for WAT 2019 Multimodal Translation Task
Shantipriya Parida | Ondřej Bojar | Petr Motlicek

This paper describes the Idiap submission to WAT 2019 for the English-Hindi Multi-Modal Translation Task. We have used the state-of-the-art Transformer model and utilized the IITB English-Hindi parallel corpus as an additional data source. Among the different tracks of the multi-modal task, we have participated in the Text-Only track for the evaluation and challenge test sets. Our submission ranks first in its track among the competitors in terms of both automatic and manual evaluation. Based on automatic scores, our text-only submission also outperforms systems that consider visual information in the multi-modal translation task.

UCSYNLP-Lab Machine Translation Systems for WAT 2019
Yimon ShweSin | Win Pa Pa | KhinMar Soe

This paper describes the UCSYNLP-Lab submission to WAT 2019 for the Myanmar-English translation tasks in both directions. We used neural machine translation systems with an attention model and utilized the UCSY corpus and the ALT corpus. In the NMT systems with attention, we use word-level as well as syllable-level segmentation. Notably, we cleaned the UCSY corpus for WAT 2019. Therefore, the UCSY corpus for WAT 2019 is not identical to the one used in WAT 2018. Experiments show that the translation systems can produce substantial improvements.