Liang Huang


2021

pdf bib
Proceedings of the Second Workshop on Automatic Simultaneous Translation
Hua Wu | Colin Cherry | Liang Huang | Zhongjun He | Qun Liu | Maha Elbayad | Mark Liberman | Haifeng Wang | Mingbo Ma | Ruiqing Zhang
Proceedings of the Second Workshop on Automatic Simultaneous Translation

2020

pdf bib
Simultaneous Translation
Liang Huang | Colin Cherry | Mingbo Ma | Naveen Arivazhagan | Zhongjun He
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Simultaneous translation, which performs translation concurrently with the source speech, is widely useful in many scenarios such as international conferences, negotiations, press releases, legal proceedings, and medicine. This problem has long been considered one of the hardest problems in AI and one of its holy grails. Recently, with rapid improvements in machine translation, speech recognition, and speech synthesis, there has been exciting progress towards simultaneous translation. This tutorial will focus on the design and evaluation of policies for simultaneous translation, to leave attendees with a deep technical understanding of the history, the recent advances, and the remaining challenges in this field.

pdf bib
Opportunistic Decoding with Timely Correction for Simultaneous Translation
Renjie Zheng | Mingbo Ma | Baigong Zheng | Kaibo Liu | Liang Huang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Simultaneous translation has many important application scenarios and attracts much attention from both academia and industry recently. Most existing frameworks, however, have difficulties in balancing between the translation quality and latency, i.e., the decoding policy is usually either too aggressive or too conservative. We propose an opportunistic decoding technique with timely correction ability, which always (over-)generates a certain mount of extra words at each step to keep the audience on track with the latest information. At the same time, it also corrects, in a timely fashion, the mistakes in the former overgenerated words when observing more source context to ensure high translation quality. Experiments show our technique achieves substantial reduction in latency and up to +3.1 increase in BLEU, with revision rate under 8 % in Chinese-to-English and English-to-Chinese translation.

pdf bib
Proceedings of the First Workshop on Automatic Simultaneous Translation
Hua Wu | Collin Cherry | Liang Huang | Zhongjun He | Mark Liberman | James Cross | Yang Liu
Proceedings of the First Workshop on Automatic Simultaneous Translation

pdf bib
Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training
Renjie Zheng | Mingbo Ma | Baigong Zheng | Kaibo Liu | Jiahong Yuan | Kenneth Church | Liang Huang
Findings of the Association for Computational Linguistics: EMNLP 2020

Simultaneous speech-to-speech translation is an extremely challenging but widely useful scenario that aims to generate target-language speech only a few seconds behind the source-language speech. In addition, we have to continuously translate a speech of multiple sentences, but all recent solutions merely focus on the single-sentence scenario. As a result, current approaches will accumulate more and more latencies in later sentences when the speaker talks faster and introduce unnatural pauses into translated speech when the speaker talks slower. To overcome these issues, we propose Self-Adaptive Translation which flexibly adjusts the length of translations to accommodate different source speech rates. At similar levels of translation quality (as measured by BLEU), our method generates more fluent target speech latency than the baseline, in both Zh-En directions.

2019

pdf bib
Simpler and Faster Learning of Adaptive Policies for Simultaneous Translation
Baigong Zheng | Renjie Zheng | Mingbo Ma | Liang Huang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Simultaneous translation is widely useful but remains challenging. Previous work falls into two main categories : (a) fixed-latency policies such as Ma et al. (2019) and (b) adaptive policies such as Gu et al. The former are simple and effective, but have to aggressively predict future content due to diverging source-target word order ; the latter do not anticipate, but suffer from unstable and inefficient training. To combine the merits of both approaches, we propose a simple supervised-learning framework to learn an adaptive policy from oracle READ / WRITE sequences generated from parallel text. At each step, such an oracle sequence chooses to WRITE the next target word if the available source sentence context provides enough information to do so, otherwise READ the next source word. Experiments on German = English show that our method, without retraining the underlying NMT model, can learn flexible policies with better BLEU scores and similar latencies compared to previous work.

2018

pdf bib
Breaking the Beam Search Curse : A Study of (Re-)Scoring Methods and Stopping Criteria for Neural Machine Translation
Yilin Yang | Liang Huang | Mingbo Ma
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Beam search is widely used in neural machine translation, and usually improves translation quality compared to greedy search. It has been widely observed that, however, beam sizes larger than 5 hurt translation quality. We explain why this happens, and propose several methods to address this problem. Furthermore, we discuss the optimal stopping criteria for these methods. Results show that our hyperparameter-free methods outperform the widely-used hyperparameter-free heuristic of length normalization by +2.0 BLEU, and achieve the best results among all methods on Chinese-to-English translation.

pdf bib
Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation
Renjie Zheng | Mingbo Ma | Liang Huang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Neural text generation, including neural machine translation, image captioning, and summarization, has been quite successful recently. However, during training time, typically only one reference is considered for each example, even though there are often multiple references available, e.g., 4 references in NIST MT evaluations, and 5 references in image captioning data. We first investigate several different ways of utilizing multiple human references during training. But more importantly, we then propose an algorithm to generate exponentially many pseudo-references by first compressing existing human references into lattices and then traversing them to generate new pseudo-references. These approaches lead to substantial improvements over strong baselines in both machine translation (+1.5 BLEU) and image captioning (+3.1 BLEU / +11.7 CIDEr).

pdf bib
Speeding Up Neural Machine Translation Decoding by Cube Pruning
Wen Zhang | Liang Huang | Yang Feng | Lei Shen | Qun Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Although neural machine translation has achieved promising results, it suffers from slow translation speed. The direct consequence is that a trade-off has to be made between translation quality and speed, thus its performance can not come into full play. We apply cube pruning, a popular technique to speed up dynamic programming, into neural machine translation to speed up the translation. To construct the equivalence class, similar target hidden states are combined, leading to less RNN expansion operations on the target side and less softmax operations over the large target vocabulary. The experiments show that, at the same or even better translation quality, our method can translate faster compared with naive beam search by 3.3x on GPUs and 3.5x on CPUs.

pdf bib
Linear-time Constituency Parsing with RNNs and Dynamic ProgrammingRNNs and Dynamic Programming
Juneki Hong | Liang Huang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Recently, span-based constituency parsing has achieved competitive accuracies with extremely simple models by using bidirectional RNNs to model spans. However, the minimal span parser of Stern et al. (2017a) which holds the current state of the art accuracy is a chart parser running in cubic time, O(n^3), which is too slow for longer sentences and for applications beyond sentence boundaries such as end-to-end discourse parsing and joint sentence boundary detection and parsing. We propose a linear-time constituency parser with RNNs and dynamic programming using graph-structured stack and beam search, which runs in time O(n b^2) where b is the beam size. We further speed this up to O(n b log b) by integrating cube pruning. Compared with chart parsing baselines, this linear-time parser is substantially faster for long sentences on the Penn Treebank and orders of magnitude faster for discourse parsing, and achieves the highest F1 accuracy on the Penn Treebank among single model end-to-end systems.O(n^3), which is too slow for longer sentences and for applications beyond sentence boundaries such as end-to-end discourse parsing and joint sentence boundary detection and parsing. We propose a linear-time constituency parser with RNNs and dynamic programming using graph-structured stack and beam search, which runs in time O(n b^2) where b is the beam size. We further speed this up to O(n b log b) by integrating cube pruning. Compared with chart parsing baselines, this linear-time parser is substantially faster for long sentences on the Penn Treebank and orders of magnitude faster for discourse parsing, and achieves the highest F1 accuracy on the Penn Treebank among single model end-to-end systems.

2017

pdf bib
Group Sparse CNNs for Question Classification with Answer SetsCNNs for Question Classification with Answer Sets
Mingbo Ma | Liang Huang | Bing Xiang | Bowen Zhou
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Question classification is an important task with wide applications. However, traditional techniques treat questions as general sentences, ignoring the corresponding answer data. In order to consider answer information into question modeling, we first introduce novel group sparse autoencoders which refine question representation by utilizing group information in the answer set. We then propose novel group sparse CNNs which naturally learn question representation with respect to their answers by implanting group sparse autoencoders into traditional CNNs. The proposed model significantly outperform strong baselines on four datasets.

pdf bib
When to Finish? Optimal Beam Search for Neural Text Generation (modulo beam size)
Liang Huang | Kai Zhao | Mingbo Ma
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

In neural text generation such as neural machine translation, summarization, and image captioning, beam search is widely used to improve the output text quality. However, in the neural generation setting, hypotheses can finish in different steps, which makes it difficult to decide when to end beam search to ensure optimality. We propose a provably optimal beam search algorithm that will always return the optimal-score complete hypothesis (modulo beam size), and finish as soon as the optimality is established. To counter neural generation’s tendency for shorter hypotheses, we also introduce a bounded length reward mechanism which allows a modified version of our beam search algorithm to remain optimal. Experiments on neural machine translation demonstrate that our principled beam search algorithm leads to improvement in BLEU score over previously proposed alternatives.