Rui Wang


2022

pdf bib
WikiDiverse A Multimodal Entity Linking Dataset with Diversified Contextual Topics and Entity TypesWikiDiverse: A Multimodal Entity Linking Dataset with Diversified Contextual Topics and Entity Types
Xuwu Wang | Junfeng Tian | Min Gui | Zhixu Li | Rui Wang | Ming Yan | Lihan Chen | Yanghua Xiao
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Multimodal Entity Linking MEL which aims at linking mentions with multimodal contexts to the referent entities from a knowledge base e.g. Wikipedia is an essential task for many multimodal applications Although much attention has been paid to MEL the shortcomings of existing MEL datasets including limited contextual topics and entity types simplified mention ambiguity and restricted availability have caused great obstacles to the research and application of MEL In this paper we present WikiDiverse a high quality human annotated MEL dataset with diversified contextual topics and entity types from Wikinews which uses Wikipedia as the corresponding knowledge base A well tailored annotation procedure is adopted to ensure the quality of the dataset Based on WikiDiverse a sequence of well designed MEL models with intra modality and inter modality attentions are implemented which utilize the visual information of images more adequately than existing MEL models do Extensive experimental analyses are conducted to investigate the contributions of different modalities in terms of MEL facilitating the future research on this task

pdf bib
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao | Rui Wang | Long Zhou | Chengyi Wang | Shuo Ren | Yu Wu | Shujie Liu | Tom Ko | Qing Li | Yu Zhang | Zhihua Wei | Yao Qian | Jinyu Li | Furu Wei
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Motivated by the success of T5 (Text-To-Text Transfer Transformer) in pre-trained natural language processing models, we propose a unified-modal SpeechT5 framework that explores the encoder-decoder pre-training for self-supervised speech/text representation learning. The SpeechT5 framework consists of a shared encoder-decoder network and six modal-specific (speech/text) pre/post-nets. After preprocessing the input speech/text through the pre-nets, the shared encoder-decoder network models the sequence-to-sequence transformation, and then the post-nets generate the output in the speech/text modality based on the output of the decoder. Leveraging large-scale unlabeled speech and text data, we pre-train SpeechT5 to learn a unified-modal representation, hoping to improve the modeling capability for both speech and text. To align the textual and speech information into this unified semantic space, we propose a cross-modal vector quantization approach that randomly mixes up speech/text states with latent units as the interface between encoder and decoder. Extensive evaluations show the superiority of the proposed SpeechT5 framework on a wide variety of spoken language processing tasks, including automatic speech recognition, speech synthesis, speech translation, voice conversion, speech enhancement, and speaker identification.

2021

pdf bib
DialogueCSE : Dialogue-based Contrastive Learning of Sentence EmbeddingsDialogueCSE: Dialogue-based Contrastive Learning of Sentence Embeddings
Che Liu | Rui Wang | Jinghua Liu | Jian Sun | Fei Huang | Luo Si
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Learning sentence embeddings from dialogues has drawn increasing attention due to its low annotation cost and high domain adaptability. Conventional approaches employ the siamese-network for this task, which obtains the sentence embeddings through modeling the context-response semantic relevance by applying a feed-forward network on top of the sentence encoders. However, as the semantic textual similarity is commonly measured through the element-wise distance metrics (e.g. cosine and L2 distance), such architecture yields a large gap between training and evaluating. In this paper, we propose DialogueCSE, a dialogue-based contrastive learning approach to tackle this issue. DialogueCSE first introduces a novel matching-guided embedding (MGE) mechanism, which generates a context-aware embedding for each candidate response embedding (i.e. the context-free embedding) according to the guidance of the multi-turn context-response matching matrices. Then it pairs each context-aware embedding with its corresponding context-free embedding and finally minimizes the contrastive loss across all pairs. We evaluate our model on three multi-turn dialogue datasets : the Microsoft Dialogue Corpus, the Jing Dong Dialogue Corpus, and the E-commerce Dialogue Corpus. Evaluation results show that our approach significantly outperforms the baselines across all three datasets in terms of MAP and Spearman’s correlation measures, demonstrating its effectiveness. Further quantitative experiments show that our approach achieves better performance when leveraging more dialogue context and remains robust when less training data is provided.

2020

pdf bib
Neural Topic Modeling with Bidirectional Adversarial Training
Rui Wang | Xuemeng Hu | Deyu Zhou | Yulan He | Yuxuan Xiong | Chenchen Ye | Haiyang Xu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recent years have witnessed a surge of interests of using neural topic models for automatic topic extraction from text, since they avoid the complicated mathematical derivations for model inference as in traditional topic models such as Latent Dirichlet Allocation (LDA). However, these models either typically assume improper prior (e.g. Gaussian or Logistic Normal) over latent topic space or could not infer topic distribution for a given document. To address these limitations, we propose a neural topic modeling approach, called Bidirectional Adversarial Topic (BAT) model, which represents the first attempt of applying bidirectional adversarial training for neural topic modeling. The proposed BAT builds a two-way projection between the document-topic distribution and the document-word distribution. It uses a generator to capture the semantic patterns from texts and an encoder for topic inference. Furthermore, to incorporate word relatedness information, the Bidirectional Adversarial Topic model with Gaussian (Gaussian-BAT) is extended from BAT. To verify the effectiveness of BAT and Gaussian-BAT, three benchmark corpora are used in our experiments. The experimental results show that BAT and Gaussian-BAT obtain more coherent topics, outperforming several competitive baselines. Moreover, when performing text clustering based on the extracted topics, our models outperform all the baselines, with more significant improvements achieved by Gaussian-BAT where an increase of near 6 % is observed in accuracy.

pdf bib
Content Word Aware Neural Machine Translation
Kehai Chen | Rui Wang | Masao Utiyama | Eiichiro Sumita
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Neural machine translation (NMT) encodes the source sentence in a universal way to generate the target sentence word-by-word. However, NMT does not consider the importance of word in the sentence meaning, for example, some words (i.e., content words) express more important meaning than others (i.e., function words). To address this limitation, we first utilize word frequency information to distinguish between content and function words in a sentence, and then design a content word-aware NMT to improve translation performance. Empirical results on the WMT14 English-to-German, WMT14 English-to-French, and WMT17 Chinese-to-English translation tasks show that the proposed methods can significantly improve the performance of Transformer-based NMT.

pdf bib
Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation
Haipeng Sun | Rui Wang | Kehai Chen | Masao Utiyama | Eiichiro Sumita | Tiejun Zhao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Unsupervised neural machine translation (UNMT) has recently achieved remarkable results for several language pairs. However, it can only translate between a single language pair and can not produce translation results for multiple language pairs at the same time. That is, research on multilingual UNMT has been limited. In this paper, we empirically introduce a simple method to translate between thirteen languages using a single encoder and a single decoder, making use of multilingual data to improve UNMT for all language pairs. On the basis of the empirical findings, we propose two knowledge distillation methods to further enhance multilingual UNMT performance. Our experiments on a dataset with English translated to and from twelve other languages (including three language families and six language branches) show remarkable results, surpassing strong unsupervised individual baselines while achieving promising performance between non-English language pairs in zero-shot translation scenarios and alleviating poor performance in low-resource language pairs.

pdf bib
Multi-Domain Dialogue Acts and Response Co-Generation
Kai Wang | Junfeng Tian | Rui Wang | Xiaojun Quan | Jianxing Yu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Generating fluent and informative responses is of critical importance for task-oriented dialogue systems. Existing pipeline approaches generally predict multiple dialogue acts first and use them to assist response generation. There are at least two shortcomings with such approaches. First, the inherent structures of multi-domain dialogue acts are neglected. Second, the semantic associations between acts and responses are not taken into account for response generation. To address these issues, we propose a neural co-generation model that generates dialogue acts and responses concurrently. Unlike those pipeline approaches, our act generation module preserves the semantic structures of multi-domain dialogue acts and our response generation module dynamically attends to different acts as needed. We train the two modules jointly using an uncertainty loss to adjust their task weights adaptively. Extensive experiments are conducted on the large-scale MultiWOZ dataset and the results show that our model achieves very favorable improvement over several state-of-the-art models in both automatic and human evaluations.

pdf bib
High-order Semantic Role Labeling
Zuchao Li | Hai Zhao | Rui Wang | Kevin Parnow
Findings of the Association for Computational Linguistics: EMNLP 2020

Semantic role labeling is primarily used to identify predicates, arguments, and their semantic relationships. Due to the limitations of modeling methods and the conditions of pre-identified predicates, previous work has focused on the relationships between predicates and arguments and the correlations between arguments at most, while the correlations between predicates have been neglected for a long time. High-order features and structure learning were very common in modeling such correlations before the neural network era. In this paper, we introduce a high-order graph structure for the neural semantic role labeling model, which enables the model to explicitly consider not only the isolated predicate-argument pairs but also the interaction between the predicate-argument pairs. Experimental results on 7 languages of the CoNLL-2009 benchmark show that the high-order structural learning techniques are beneficial to the strong performing SRL models and further boost our baseline to achieve new state-of-the-art results.

pdf bib
SJTU-NICT’s Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation TaskSJTU-NICT’s Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
Zuchao Li | Hai Zhao | Rui Wang | Kehai Chen | Masao Utiyama | Eiichiro Sumita
Proceedings of the Fifth Conference on Machine Translation

In this paper, we introduced our joint team SJTU-NICT ‘s participation in the WMT 2020 machine translation shared task. In this shared task, we participated in four translation directions of three language pairs : English-Chinese, English-Polish on supervised machine translation track, German-Upper Sorbian on low-resource and unsupervised machine translation tracks. Based on different conditions of language pairs, we have experimented with diverse neural machine translation (NMT) techniques : document-enhanced NMT, XLM pre-trained language model enhanced NMT, bidirectional translation as a pre-training, reference language based UNMT, data-dependent gaussian prior objective, and BT-BLEU collaborative filtering self-training. We also used the TF-IDF algorithm to filter the training set to obtain a domain more similar set with the test set for finetuning. In our submissions, the primary systems won the first place on English to Chinese, Polish to English, and German to Upper Sorbian translation directions.

2019

pdf bib
NICT’s Unsupervised Neural and Statistical Machine Translation Systems for the WMT19 News Translation TaskNICT’s Unsupervised Neural and Statistical Machine Translation Systems for the WMT19 News Translation Task
Benjamin Marie | Haipeng Sun | Rui Wang | Kehai Chen | Atsushi Fujita | Masao Utiyama | Eiichiro Sumita
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper presents the NICT’s participation in the WMT19 unsupervised news translation task. We participated in the unsupervised translation direction : German-Czech. Our primary submission to the task is the result of a simple combination of our unsupervised neural and statistical machine translation systems. Our system is ranked first for the German-to-Czech translation task, using only the data provided by the organizers (constraint’), according to both BLEU-cased and human evaluation. We also performed contrastive experiments with other language pairs, namely, English-Gujarati and English-Kazakh, to better assess the effectiveness of unsupervised machine translation in for distant language pairs and in truly low-resource conditions.

pdf bib
Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation
Haipeng Sun | Rui Wang | Kehai Chen | Masao Utiyama | Eiichiro Sumita | Tiejun Zhao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Unsupervised bilingual word embedding (UBWE), together with other technologies such as back-translation and denoising, has helped unsupervised neural machine translation (UNMT) achieve remarkable results in several language pairs. In previous methods, UBWE is first trained using non-parallel monolingual corpora and then this pre-trained UBWE is used to initialize the word embedding in the encoder and decoder of UNMT. That is, the training of UBWE and UNMT are separate. In this paper, we first empirically investigate the relationship between UBWE and UNMT. The empirical findings show that the performance of UNMT is significantly affected by the performance of UBWE. Thus, we propose two methods that train UNMT with UBWE agreement. Empirical results on several language pairs show that the proposed methods significantly outperform conventional UNMT.

pdf bib
Neural Machine Translation with Reordering Embeddings
Kehai Chen | Rui Wang | Masao Utiyama | Eiichiro Sumita
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The reordering model plays an important role in phrase-based statistical machine translation. However, there are few works that exploit the reordering information in neural machine translation. In this paper, we propose a reordering mechanism to learn the reordering embedding of a word based on its contextual information. These learned reordering embeddings are stacked together with self-attention networks to learn sentence representation for machine translation. The reordering mechanism can be easily integrated into both the encoder and the decoder in the Transformer translation system. Experimental results on WMT’14 English-to-German, NIST Chinese-to-English, and WAT Japanese-to-English translation tasks demonstrate that the proposed methods can significantly improve the performance of the Transformer.

pdf bib
Semi-supervised Domain Adaptation for Dependency Parsing
Zhenghua Li | Xue Peng | Min Zhang | Rui Wang | Luo Si
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

During the past decades, due to the lack of sufficient labeled data, most studies on cross-domain parsing focus on unsupervised domain adaptation, assuming there is no target-domain training data. However, unsupervised approaches make limited progress so far due to the intrinsic difficulty of both domain adaptation and parsing. This paper tackles the semi-supervised domain adaptation problem for Chinese dependency parsing, based on two newly-annotated large-scale domain-aware datasets. We propose a simple domain embedding approach to merge the source- and target-domain training data, which is shown to be more effective than both direct corpus concatenation and multi-task learning. In order to utilize unlabeled target-domain data, we employ the recent contextualized word representations and show that a simple fine-tuning procedure can further boost cross-domain parsing accuracy by large margin.

pdf bib
Sentence-Level Agreement for Neural Machine Translation
Mingming Yang | Rui Wang | Kehai Chen | Masao Utiyama | Eiichiro Sumita | Min Zhang | Tiejun Zhao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The training objective of neural machine translation (NMT) is to minimize the loss between the words in the translated sentences and those in the references. In NMT, there is a natural correspondence between the source sentence and the target sentence. However, this relationship has only been represented using the entire neural network and the training objective is computed in word-level. In this paper, we propose a sentence-level agreement module to directly minimize the difference between the representation of source and target sentence. The proposed agreement module can be integrated into NMT as an additional training objective function and can also be used to enhance the representation of the source sentences. Empirical results on the NIST Chinese-to-English and WMT English-to-German tasks show the proposed agreement module can significantly improve the NMT performance.

pdf bib
SJTU-NICT at MRP 2019 : Multi-Task Learning for End-to-End Uniform Semantic Graph ParsingSJTU-NICT at MRP 2019: Multi-Task Learning for End-to-End Uniform Semantic Graph Parsing
Zuchao Li | Hai Zhao | Zhuosheng Zhang | Rui Wang | Masao Utiyama | Eiichiro Sumita
Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the 2019 Conference on Natural Language Learning

This paper describes our SJTU-NICT’s system for participating in the shared task on Cross-Framework Meaning Representation Parsing (MRP) at the 2019 Conference for Computational Language Learning (CoNLL). Our system uses a graph-based approach to model a variety of semantic graph parsing tasks. Our main contributions in the submitted system are summarized as follows : 1. Our model is fully end-to-end and is capable of being trained only on the given training set which does not rely on any other extra training source including the companion data provided by the organizer ; 2. We extend our graph pruning algorithm to a variety of semantic graphs, solving the problem of excessive semantic graph search space ; 3. We introduce multi-task learning for multiple objectives within the same framework. The evaluation results show that our system achieved second place in the overall F_1 score and achieved the best F_1 score on the DM framework.F_1 score and achieved the best F_1 score on the DM framework.

2018

pdf bib
Dynamic Sentence Sampling for Efficient Training of Neural Machine Translation
Rui Wang | Masao Utiyama | Eiichiro Sumita
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Traditional Neural machine translation (NMT) involves a fixed training procedure where each sentence is sampled once during each epoch. In reality, some sentences are well-learned during the initial few epochs ; however, using this approach, the well-learned sentences would continue to be trained along with those sentences that were not well learned for 10-30 epochs, which results in a wastage of time. Here, we propose an efficient method to dynamically sample the sentences in order to accelerate the NMT training. In this approach, a weight is assigned to each sentence based on the measured difference between the training costs of two iterations. Further, in each epoch, a certain percentage of sentences are dynamically sampled according to their weights. Empirical results based on the NIST Chinese-to-English and the WMT English-to-German tasks show that the proposed method can significantly accelerate the NMT training and improve the NMT performance.

2017

pdf bib
Context-Aware Smoothing for Neural Machine Translation
Kehai Chen | Rui Wang | Masao Utiyama | Eiichiro Sumita | Tiejun Zhao
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In Neural Machine Translation (NMT), each word is represented as a low-dimension, real-value vector for encoding its syntax and semantic information. This means that even if the word is in a different sentence context, it is represented as the fixed vector to learn source representation. Moreover, a large number of Out-Of-Vocabulary (OOV) words, which have different syntax and semantic information, are represented as the same vector representation of unk. To alleviate this problem, we propose a novel context-aware smoothing method to dynamically learn a sentence-specific vector for each word (including OOV words) depending on its local context words in a sentence. The learned context-aware representation is integrated into the NMT to improve the translation performance. Empirical results on NIST Chinese-to-English translation task show that the proposed approach achieves 1.78 BLEU improvements on average over a strong attentional NMT, and outperforms some existing systems.

pdf bib
Sentence Embedding for Neural Machine Translation Domain Adaptation
Rui Wang | Andrew Finch | Masao Utiyama | Eiichiro Sumita
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Although new corpora are becoming increasingly available for machine translation, only those that belong to the same or similar domains are typically able to improve translation performance. Recently Neural Machine Translation (NMT) has become prominent in the field. However, most of the existing domain adaptation methods only focus on phrase-based machine translation. In this paper, we exploit the NMT’s internal embedding of the source sentence and use the sentence embedding similarity to select the sentences which are close to in-domain data. The empirical adaptation results on the IWSLT English-French and NIST Chinese-English tasks show that the proposed methods can substantially improve NMT performance by 2.4-9.0 BLEU points, outperforming the existing state-of-the-art baseline by 2.3-4.5 BLEU points.

pdf bib
Instance Weighting for Neural Machine Translation Domain Adaptation
Rui Wang | Masao Utiyama | Lemao Liu | Kehai Chen | Eiichiro Sumita
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Instance weighting has been widely applied to phrase-based machine translation domain adaptation. However, it is challenging to be applied to Neural Machine Translation (NMT) directly, because NMT is not a linear model. In this paper, two instance weighting technologies, i.e., sentence weighting and domain weighting with a dynamic weight learning strategy, are proposed for NMT domain adaptation. Empirical results on the IWSLT English-German / French tasks show that the proposed methods can substantially improve NMT performance by up to 2.7-6.7 BLEU points, outperforming the existing baselines by up to 1.6-3.6 BLEU points.

pdf bib
Neural Machine Translation with Source Dependency Representation
Kehai Chen | Rui Wang | Masao Utiyama | Lemao Liu | Akihiro Tamura | Eiichiro Sumita | Tiejun Zhao
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Source dependency information has been successfully introduced into statistical machine translation. However, there are only a few preliminary attempts for Neural Machine Translation (NMT), such as concatenating representations of source word and its dependency label together. In this paper, we propose a novel NMT with source dependency representation to improve translation performance of NMT, especially long sentences. Empirical results on NIST Chinese-to-English translation task show that our method achieves 1.6 BLEU improvements on average over a strong NMT system.