Anuj Kumar


2021

El Volumen Louder Por Favor: Code-switching in Task-oriented Semantic Parsing
Arash Einolghozati | Abhinav Arora | Lorena Sainz-Maza Lecanda | Anuj Kumar | Sonal Gupta
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Being able to parse code-switched (CS) utterances, such as Spanish+English or Hindi+English, is essential to democratize task-oriented semantic parsing systems for certain locales. In this work, we focus on Spanglish (Spanish+English) and release a dataset, CSTOP, containing 5800 CS utterances alongside their semantic parses. We examine the CS generalizability of various cross-lingual (XL) models and demonstrate the advantage of pre-trained XL language models when data for only one language is present. Accordingly, we focus on improving the pre-trained models for the case where only an English corpus, alongside either zero or a few CS training instances, is available. We propose two data augmentation methods for the zero-shot and few-shot settings: fine-tuning using translate-and-align, and augmenting using a generation model followed by match-and-filter. Combining the few-shot setting with the above improvements decreases the initial 30-point accuracy gap between the zero-shot and full-data settings by two thirds.
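
As an illustration of the translate-and-align idea above, here is a minimal Python sketch (not the paper's implementation): slot annotations on an English utterance are projected onto its translation via a word alignment. The translation, alignment, and slot label below are hard-coded toy stand-ins for a real MT system, a word aligner, and the CSTOP label set.

# Toy sketch of translate-and-align: project [start, end) slot spans
# from a source (English) utterance onto its translation using a
# list of (src_idx, tgt_idx) word-alignment pairs.
def project_slots(src_tokens, tgt_tokens, alignment, slots):
    projected = []
    for label, start, end in slots:
        tgt_positions = [t for s, t in alignment if start <= s < end]
        if tgt_positions:  # drop spans with no aligned target tokens
            projected.append((label, min(tgt_positions), max(tgt_positions) + 1))
    return projected

src = "set an alarm for eight".split()
tgt = "pon una alarma para las ocho".split()           # toy translation
alignment = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 5)]   # src -> tgt indices
slots = [("SL:DATE_TIME", 3, 5)]                       # "for eight"
print(project_slots(src, tgt, alignment, slots))       # [('SL:DATE_TIME', 3, 6)]

The projected spans would then serve as extra fine-tuning data for the parser; the abstract's second method (generate, then match-and-filter) is not sketched here.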

2020

Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI
Tsung-Hsien Wen | Asli Celikyilmaz | Zhou Yu | Alexandros Papangelis | Mihail Eric | Anuj Kumar | Iñigo Casanueva | Rushin Shah
Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI

2019

Memory Grounded Conversational Reasoning
Seungwhan Moon | Pararth Shah | Rajen Subba | Anuj Kumar
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

We demonstrate a conversational system which engages the user through a multi-modal, multi-turn dialog over the user’s memories. The system can perform QA over memories by responding to user queries to recall specific attributes and associated media (e.g., photos) of past episodic memories. The system can also make proactive suggestions to surface related events or facts from past memories to make conversations more engaging and natural. To implement such a system, we collect a new corpus of memory grounded conversations, which comprises human-to-human role-playing dialogs given synthetic memory graphs with simulated attributes. Our proof-of-concept system operates on these synthetic memory graphs; however, it can be trained and applied to real-world user memory data (e.g., photo albums). We present the architecture of the proposed conversational system, and example queries that the system supports.
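
As a concrete, deliberately simplified picture of the memory-QA capability described above, the Python sketch below recalls an attribute and its associated media from a stored memory record; the record schema and field names are hypothetical stand-ins for the paper's synthetic memory graphs.

# Toy memory store: each record is one episodic memory with attributes
# and attached media, loosely mirroring the simulated attributes above.
memories = [
    {"event": "hiking trip", "date": "2018-07-14",
     "participants": ["Bob"], "photos": ["hike_01.jpg"]},
]

def recall(memories, event, attribute):
    # Answer "who / when / which photos" style queries over memories.
    for m in memories:
        if m["event"] == event:
            return m.get(attribute), m.get("photos", [])
    return None, []

answer, media = recall(memories, "hiking trip", "participants")
print(answer, media)  # ['Bob'] ['hike_01.jpg']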

Proceedings of the First Workshop on NLP for Conversational AI
Yun-Nung Chen | Tania Bedrax-Weiss | Dilek Hakkani-Tur | Anuj Kumar | Mike Lewis | Thang-Minh Luong | Pei-Hao Su | Tsung-Hsien Wen
Proceedings of the First Workshop on NLP for Conversational AI

A Tree-to-Sequence Model for Neural NLG in Task-Oriented Dialog
Jinfeng Rao | Kartikeya Upasani | Anusha Balakrishnan | Michael White | Anuj Kumar | Rajen Subba
Proceedings of the 12th International Conference on Natural Language Generation

Generating fluent natural language responses from structured semantic representations is a critical step in task-oriented conversational systems. Sequence-to-sequence models on flat meaning representations (MRs) have been dominant in this task, for example in the E2E NLG Challenge. Previous work has shown that a tree-structured MR can improve the model for better discourse-level structuring and sentence-level planning. In this work, we propose a tree-to-sequence model that uses a tree-LSTM encoder to leverage the tree structures in the input MR, and further enhances decoding with a structure-enhanced attention mechanism. In addition, we explore combining these enhancements with constrained decoding to improve semantic correctness. Our experiments not only show significant improvements over standard seq2seq baselines, but also demonstrate that our model is more data-efficient and generalizes better to hard scenarios.
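
For readers unfamiliar with the encoder this abstract builds on, here is a minimal PyTorch sketch of a child-sum tree-LSTM cell (Tai et al., 2015); it follows the standard formulation rather than the authors' code, and omits the structure-enhanced attention and constrained decoding.

import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    # Composes a tree node's input x with its children's (h, c) states.
    def __init__(self, in_dim, h_dim):
        super().__init__()
        self.iou = nn.Linear(in_dim + h_dim, 3 * h_dim)  # input/output/update gates
        self.f = nn.Linear(in_dim + h_dim, h_dim)        # one forget gate per child

    def forward(self, x, child_h, child_c):
        # x: (in_dim,); child_h, child_c: (num_children, h_dim)
        h_sum = child_h.sum(dim=0)
        i, o, u = self.iou(torch.cat([x, h_sum])).chunk(3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        f = torch.sigmoid(self.f(torch.cat(
            [x.expand(child_h.size(0), -1), child_h], dim=1)))
        c = i * u + (f * child_c).sum(dim=0)
        return o * torch.tanh(c), c

cell = ChildSumTreeLSTMCell(in_dim=8, h_dim=16)
x = torch.randn(8)                                         # node embedding
child_h, child_c = torch.zeros(1, 16), torch.zeros(1, 16)  # leaf: one zero child
h, c = cell(x, child_h, child_c)

A full tree-to-sequence model would apply this cell bottom-up over the input MR tree and let a sequence decoder attend over the resulting node states.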

Memory Graph Networks for Explainable Memory-grounded Question Answering
Seungwhan Moon | Pararth Shah | Anuj Kumar | Rajen Subba
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

We introduce Episodic Memory QA, the task of answering personal user questions grounded on a memory graph (MG), where episodic memories and related entity nodes are connected via relational edges. We create a new benchmark dataset, first by generating synthetic memory graphs with simulated attributes, and then by composing 100K QA pairs for the generated MGs with bootstrapped scripts. To address the unique challenges of the proposed task, we propose Memory Graph Networks (MGN), a novel extension of memory networks that enables dynamic expansion of memory slots through graph traversals, and is thus able to answer queries that require contexts from multiple linked episodes and external knowledge. We then propose the Episodic Memory QA Net, which uses multiple module networks to effectively handle various question types. Empirical results show improvement over QA baselines in top-k answer prediction accuracy on the proposed task. The proposed model also generates a graph walk path and attention vectors for each predicted answer, providing a natural way to explain its QA reasoning.
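
As a rough intuition for MGN's dynamic memory expansion, here is a minimal Python sketch: starting from seed memory nodes, walk relational edges to pull linked episodes and entities into the memory slots before answer prediction. The plain breadth-first walk and toy graph below are illustrative stand-ins for the learned graph traversals in the paper.

from collections import deque

def expand_memory(graph, seeds, max_hops=2):
    # graph maps node -> [(relation, neighbor)]; BFS out to max_hops.
    slots, seen = [], set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, hops = frontier.popleft()
        slots.append(node)  # node becomes an attended memory slot
        if hops == max_hops:
            continue
        for _rel, nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, hops + 1))
    return slots

memory_graph = {
    "episode:beach_trip": [("with", "person:Alice"), ("at", "place:Malibu")],
    "person:Alice": [("lives_in", "place:Seattle")],
}
print(expand_memory(memory_graph, ["episode:beach_trip"]))
# ['episode:beach_trip', 'person:Alice', 'place:Malibu', 'place:Seattle']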