Chandra Bhagavatula


2021

pdf bib
DExperts : Decoding-Time Controlled Text Generation with Experts and Anti-ExpertsDExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
Alisa Liu | Maarten Sap | Ximing Lu | Swabha Swayamdipta | Chandra Bhagavatula | Noah A. Smith | Yejin Choi
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Despite recent advances in natural language generation, it remains challenging to control attributes of generated text. We propose DExperts : Decoding-time Experts, a decoding-time method for controlled text generation that combines a pretrained language model with expert LMs and/or anti-expert LMs in a product of experts. Intuitively, under the ensemble, tokens only get high probability if they are considered likely by the experts, and unlikely by the anti-experts. We apply DExperts to language detoxification and sentiment-controlled generation, where we outperform existing controllable generation methods on both automatic and human evaluations. Moreover, because DExperts operates only on the output of the pretrained LM, it is effective with (anti-)experts of smaller size, including when operating on GPT-3. Our work highlights the promise of tuning small LMs on text with (un)desirable attributes for efficient decoding-time steering.

pdf bib
I’m Not Mad : Commonsense Implications of Negation and ContradictionI’m Not Mad”: Commonsense Implications of Negation and Contradiction
Liwei Jiang | Antoine Bosselut | Chandra Bhagavatula | Yejin Choi
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Natural language inference requires reasoning about contradictions, negations, and their commonsense implications. Given a simple premise (e.g., I’m mad at you), humans can reason about the varying shades of contradictory statements ranging from straightforward negations (I’m not mad at you) to commonsense contradictions (I’m happy). Moreover, these negated or contradictory statements shift the commonsense implications of the original premise in interesting and nontrivial ways. For example, while I’m mad implies I’m unhappy about something, negating the premise does not necessarily negate the corresponding commonsense implications. In this paper, we present the first comprehensive study focusing on commonsense implications of negated statements and contradictions. We introduce ANION, a new commonsense knowledge graph with 624 K if-then rules focusing on negated and contradictory events. We then present joint generative and discriminative inference models for this new resource, providing novel empirical insights on how logical negations and commonsense contradictions reshape the commonsense implications of their original premises.

2020

pdf bib
Generative Data Augmentation for Commonsense Reasoning
Yiben Yang | Chaitanya Malaviya | Jared Fernandez | Swabha Swayamdipta | Ronan Le Bras | Ji-Ping Wang | Chandra Bhagavatula | Yejin Choi | Doug Downey
Findings of the Association for Computational Linguistics: EMNLP 2020

Recent advances in commonsense reasoning depend on large-scale human-annotated training sets to achieve peak performance. However, manual curation of training sets is expensive and has been shown to introduce annotation artifacts that neural models can readily exploit and overfit to. We propose a novel generative data augmentation technique, G-DAUGC, that aims to achieve more accurate and robust learning in a low-resource setting. Our approach generates synthetic examples using pretrained language models and selects the most informative and diverse set of examples for data augmentation. On experiments with multiple commonsense reasoning benchmarks, G-DAUGC consistently outperforms existing data augmentation methods based on back-translation, establishing a new state-of-the-art on WinoGrande, CODAH, and CommonsenseQA, as well as enhances out-of-distribution generalization, proving to be robust against adversaries or perturbations. Our analysis demonstrates that G-DAUGC produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.

pdf bib
Thinking Like a Skeptic : Defeasible Inference in Natural Language
Rachel Rudinger | Vered Shwartz | Jena D. Hwang | Chandra Bhagavatula | Maxwell Forbes | Ronan Le Bras | Noah A. Smith | Yejin Choi
Findings of the Association for Computational Linguistics: EMNLP 2020

Defeasible inference is a mode of reasoning in which an inference (X is a bird, therefore X flies) may be weakened or overturned in light of new evidence (X is a penguin). Though long recognized in classical AI and philosophy, defeasible inference has not been extensively studied in the context of contemporary data-driven research on natural language inference and commonsense reasoning. We introduce Defeasible NLI (abbreviated -NLI), a dataset for defeasible inference in natural language. Defeasible NLI contains extensions to three existing inference datasets covering diverse modes of reasoning : common sense, natural language inference, and social norms. From Defeasible NLI, we develop both a classification and generation task for defeasible inference, and demonstrate that the generation task is much more challenging. Despite lagging human performance, however, generative models trained on this data are capable of writing sentences that weaken or strengthen a specified inference up to 68 % of the time.\\delta-NLI), a dataset for defeasible inference in natural language. Defeasible NLI contains extensions to three existing inference datasets covering diverse modes of reasoning: common sense, natural language inference, and social norms. From Defeasible NLI, we develop both a classification and generation task for defeasible inference, and demonstrate that the generation task is much more challenging. Despite lagging human performance, however, generative models trained on this data are capable of writing sentences that weaken or strengthen a specified inference up to 68% of the time.

2017

pdf bib
Semi-supervised sequence tagging with bidirectional language models
Matthew E. Peters | Waleed Ammar | Chandra Bhagavatula | Russell Power
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks. However, in most cases, the recurrent network that operates on word-level representations to produce context sensitive representations is trained on relatively little labeled data. In this paper, we demonstrate a general semi-supervised approach for adding pretrained context embeddings from bidirectional language models to NLP systems and apply it to sequence labeling tasks. We evaluate our model on two standard datasets for named entity recognition (NER) and chunking, and in both cases achieve state of the art results, surpassing previous systems that use other forms of transfer or joint learning with additional labeled data and task specific gazetteers.