Toshiaki Nakazawa


2021

Modeling Target-side Inflection in Placeholder Translation
Ryokan Ri | Toshiaki Nakazawa | Yoshimasa Tsuruoka
Proceedings of Machine Translation Summit XVIII: Research Track

Placeholder translation systems enable users to specify how a specific phrase is translated in the output sentence. The system is trained to output special placeholder tokens, and the user-specified term is injected into the output through the context-free replacement of the placeholder token. However, this approach could result in ungrammatical sentences because it is often the case that the specified term needs to be inflected according to the context of the output, which is unknown before the translation. To address this problem, we propose a novel method of placeholder translation that can inflect specified terms according to the grammatical construction of the output sentence. We extend the seq2seq architecture with a character-level decoder that takes the lemma of a user-specified term and the words generated from the word-level decoder to output a correct inflected form of the lemma. We evaluate our approach with a Japanese-to-English translation task in the scientific writing domain and show our model can incorporate specified terms in a correct form more successfully than other comparable models.

Proceedings of the 8th Workshop on Asian Translation (WAT2021)
Toshiaki Nakazawa | Hideki Nakayama | Isao Goto | Hideya Mino | Chenchen Ding | Raj Dabre | Anoop Kunchukuttan | Shohei Higashiyama | Hiroshi Manabe | Win Pa Pa | Shantipriya Parida | Ondřej Bojar | Chenhui Chu | Akiko Eriguchi | Kaori Abe | Yusuke Oda | Katsuhito Sudoh | Sadao Kurohashi | Pushpak Bhattacharyya
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

Zero-pronoun Data Augmentation for Japanese-to-English Translation
Ryokan Ri | Toshiaki Nakazawa | Yoshimasa Tsuruoka
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

For Japanese-to-English translation, zero pronouns in Japanese pose a challenge, since the model needs to infer and produce the corresponding pronoun in the target side of the English sentence. However, although fully resolving zero pronouns often needs discourse context, in some cases, the local context within a sentence gives clues to the inference of the zero pronoun. In this study, we propose a data augmentation method that provides additional training signals for the translation model to learn correlations between local context and zero pronouns. We show that the proposed method significantly improves the accuracy of zero pronoun translation with machine translation experiments in the conversational domain.

2020

Proceedings of the 7th Workshop on Asian Translation
Toshiaki Nakazawa | Hideki Nakayama | Chenchen Ding | Raj Dabre | Anoop Kunchukuttan | Win Pa Pa | Ondřej Bojar | Shantipriya Parida | Isao Goto | Hideya Mino | Hiroshi Manabe | Katsuhito Sudoh | Sadao Kurohashi | Pushpak Bhattacharyya
Proceedings of the 7th Workshop on Asian Translation

Proceedings of the Fifth Conference on Machine Translation
Loïc Barrault | Ondřej Bojar | Fethi Bougares | Rajen Chatterjee | Marta R. Costa-jussà | Christian Federmann | Mark Fishel | Alexander Fraser | Yvette Graham | Paco Guzman | Barry Haddow | Matthias Huck | Antonio Jimeno Yepes | Philipp Koehn | André Martins | Makoto Morishita | Christof Monz | Masaaki Nagata | Toshiaki Nakazawa | Matteo Negri
Proceedings of the Fifth Conference on Machine Translation

Document-aligned Japanese-English Conversation Parallel Corpus
Matīss Rikters | Ryokan Ri | Tong Li | Toshiaki Nakazawa
Proceedings of the Fifth Conference on Machine Translation

Sentence-level (SL) machine translation (MT) has reached acceptable quality for many high-resourced languages, but not document-level (DL) MT, which is difficult to 1) train with a small amount of DL data; and 2) evaluate, as the main methods and data sets focus on SL evaluation. To address the first issue, we present a document-aligned Japanese-English conversation corpus, including balanced, high-quality business conversation data for tuning and testing. As for the second issue, we manually identify the main areas where SL MT fails to produce adequate translations for lack of context. We then create an evaluation set where these phenomena are annotated to facilitate automatic evaluation of DL systems. We train MT models using our corpus to demonstrate how using context leads to improvements.

2019

Proceedings of the 6th Workshop on Asian Translation
Toshiaki Nakazawa | Chenchen Ding | Raj Dabre | Anoop Kunchukuttan | Nobushige Doi | Yusuke Oda | Ondřej Bojar | Shantipriya Parida | Isao Goto | Hideya Mino
Proceedings of the 6th Workshop on Asian Translation

Overview of the 6th Workshop on Asian Translation
Toshiaki Nakazawa | Nobushige Doi | Shohei Higashiyama | Chenchen Ding | Raj Dabre | Hideya Mino | Isao Goto | Win Pa Pa | Anoop Kunchukuttan | Yusuke Oda | Shantipriya Parida | Ondřej Bojar | Sadao Kurohashi
Proceedings of the 6th Workshop on Asian Translation

This paper presents the results of the shared tasks from the 6th Workshop on Asian Translation (WAT2019), including Ja↔En, Ja↔Zh scientific paper translation subtasks, Ja↔En, Ja↔Ko, Ja↔En patent translation subtasks, Hi↔En, My↔En, Km↔En, Ta↔En mixed domain subtasks and Ru↔Ja news commentary translation task. For the WAT2019, 25 teams participated in the shared tasks. We also received 10 research paper submissions out of which 61 were accepted. About 400 translation results were submitted to the automatic evaluation server, and selected submissions were manually evaluated.

2017

Neural Machine Translation: Basics, Practical Aspects and Recent Trends
Fabien Cromieres | Toshiaki Nakazawa | Raj Dabre
Proceedings of the IJCNLP 2017, Tutorial Abstracts

Machine Translation (MT) is a sub-field of NLP which has experienced a number of paradigm shifts since its inception. Up until 2014, Phrase Based Statistical Machine Translation (PBSMT) approaches used to be the state of the art. In late 2014, Neural Machine Translation (NMT) was introduced and was proven to outperform all PBSMT approaches by a significant margin. Since then, the NMT approaches have undergone several transformations which have pushed the state of the art even further. This tutorial is primarily aimed at researchers who are either interested in or are fairly new to the world of NMT and want to obtain a deep understanding of NMT fundamentals. Because it will also cover the latest developments in NMT, it should also be useful to attendees with some experience in NMT.

Proceedings of the 4th Workshop on Asian Translation (WAT2017)
Toshiaki Nakazawa | Isao Goto
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

Kyoto University Participation to WAT 2017
Fabien Cromieres | Raj Dabre | Toshiaki Nakazawa | Sadao Kurohashi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

We describe here our approaches and results on the WAT 2017 shared translation tasks. Following our good results with Neural Machine Translation in the previous shared task, we continue this approach this year, with incremental improvements in models and training methods. We focused on the ASPEC dataset and were able to improve the state-of-the-art results for Chinese-to-Japanese and Japanese-to-Chinese translations.