Xiangpeng Wei


Towards Enhancing Faithfulness for Neural Machine Translation
Rongxiang Weng | Heng Yu | Xiangpeng Wei | Weihua Luo
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Neural machine translation (NMT) has achieved great success due to its ability to generate high-quality sentences. Compared with human translations, one drawback of current NMT is that translations are often not faithful to the input, e.g., omitting information or generating unrelated fragments, which inevitably decreases overall quality, especially for human readers. In this paper, we propose a novel training strategy with a multi-task learning paradigm to build a faithfulness-enhanced NMT model (named FEnmt). During the NMT training process, we sample a subset of the training set and translate it to obtain fragments that have been mistranslated. Afterward, the proposed multi-task learning paradigm is applied to both the encoder and the decoder to guide NMT to translate these fragments correctly. Both automatic and human evaluations verify that FEnmt improves translation quality by effectively reducing unfaithful translations.

Multiscale Collaborative Deep Models for Neural Machine Translation
Xiangpeng Wei | Heng Yu | Yue Hu | Yue Zhang | Rongxiang Weng | Weihua Luo
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Recent evidence reveals that Neural Machine Translation (NMT) models with deeper neural networks can be more effective but are difficult to train. In this paper, we present a MultiScale Collaborative (MSC) framework to ease the training of NMT models that are substantially deeper than those used previously. We explicitly boost the gradient back-propagation from top to bottom levels by introducing a block-scale collaboration mechanism into deep NMT models. Then, instead of forcing the whole encoder stack to directly learn a desired representation, we let each encoder block learn a fine-grained representation and enhance it by encoding spatial dependencies using a context-scale collaboration. We provide empirical evidence showing that the MSC nets are easy to optimize and can obtain improvements in translation quality from considerably increased depth. On IWSLT translation tasks with three translation directions, our extremely deep models (with 72-layer encoders) surpass strong baselines by +2.2~+3.1 BLEU points. In addition, our deep MSC achieves a BLEU score of 30.56 on the WMT14 English-to-German task, significantly outperforming state-of-the-art deep NMT models. We have included the source code in supplementary materials.

Bi-directional Cognitive Thinking Network for Machine Reading Comprehension
Wei Peng | Yue Hu | Luxi Xing | Yuqiang Xie | Jing Yu | Yajing Sun | Xiangpeng Wei
Proceedings of the 28th International Conference on Computational Linguistics

We propose a novel Bi-directional Cognitive Knowledge Framework (BCKF) for reading comprehension from the perspective of complementary learning systems theory. It aims to simulate two ways of thinking in the brain to answer questions: reverse thinking and inertial thinking. To validate the effectiveness of our framework, we design a corresponding Bi-directional Cognitive Thinking Network (BCTN) to encode the passage, generate a question (answer) given an answer (question), and decouple the bi-directional knowledge. The model is capable of reverse reasoning about questions, which can assist inertial thinking in generating more accurate answers. Competitive improvement is observed on the DuReader dataset, confirming our hypothesis that bi-directional knowledge helps the QA task. The novel framework offers an interesting perspective on machine reading comprehension and cognitive science.


Unsupervised Neural Machine Translation with Future Rewarding
Xiangpeng Wei | Yue Hu | Luxi Xing | Li Gao
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

In this paper, we alleviate the local optimality of back-translation by learning a policy (which takes the form of an encoder-decoder and is defined by its parameters) with future rewarding under the reinforcement learning framework, which aims to optimize the global word predictions for unsupervised neural machine translation. To this end, we design a novel reward function to characterize high-quality translations from two aspects: n-gram matching and semantic adequacy. The n-gram matching is defined as an alternative to the discrete BLEU metric, and the semantic adequacy measures how adequately the meaning of the source sentence is conveyed to the target. During training, our model strives to earn higher rewards by learning to produce grammatically more accurate and semantically more adequate translations. In addition, a variational inference network (VIN) is proposed to constrain the corresponding sentences in two languages to have the same or similar latent semantic code. On the widely used WMT’14 English-French, WMT’16 English-German and NIST Chinese-to-English benchmarks, our models respectively obtain 27.59/27.15, 19.65/23.42 and 22.40 BLEU points without using any labeled data, demonstrating consistent improvements over previous unsupervised NMT models.