Mona Diab


2021

pdf bib
Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel DataNMT Models to Translate Low-resource Related Languages without Parallel Data
Wei-Jen Ko | Ahmed El-Kishky | Adithya Renduchintala | Vishrav Chaudhary | Naman Goyal | Francisco Guzmán | Pascale Fung | Philipp Koehn | Mona Diab
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

The scarcity of parallel data is a major obstacle for training high-quality machine translation systems for low-resource languages. Fortunately, some low-resource languages are linguistically related or similar to high-resource languages ; these related languages may share many lexical or syntactic structures. In this work, we exploit this linguistic overlap to facilitate translating to and from a low-resource language with only monolingual data, in addition to any parallel data in the related high-resource language. Our method, NMT-Adapt, combines denoising autoencoding, back-translation and adversarial objectives to utilize monolingual data for low-resource adaptation. We experiment on 7 languages from three different language families and show that our technique significantly improves translation into low-resource language compared to other translation baselines.

pdf bib
Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching
Thamar Solorio | Shuguang Chen | Alan W. Black | Mona Diab | Sunayana Sitaram | Victor Soto | Emre Yilmaz | Anirudh Srinivasan
Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching

2020

pdf bib
FEQA : A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive SummarizationFEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization
Esin Durmus | He He | Mona Diab
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Neural abstractive summarization models are prone to generate content inconsistent with the source document, i.e. unfaithful. Existing automatic metrics do not capture such mistakes effectively. We tackle the problem of evaluating faithfulness of a generated summary given its source document. We first collected human annotations of faithfulness for outputs from numerous models on two datasets. We find that current models exhibit a trade-off between abstractiveness and faithfulness : outputs with less word overlap with the source document are more likely to be unfaithful. Next, we propose an automatic question answering (QA) based metric for faithfulness, FEQA, which leverages recent advances in reading comprehension. Given question-answer pairs generated from the summary, a QA model extracts answers from the document ; non-matched answers indicate unfaithful information in the summary. Among metrics based on word overlap, embedding similarity, and learned language understanding models, our QA-based metric has significantly higher correlation with human faithfulness scores, especially on highly abstractive summaries.

pdf bib
Proceedings of the The 4th Workshop on Computational Approaches to Code Switching
Thamar Solorio | Monojit Choudhury | Kalika Bali | Sunayana Sitaram | Amitava Das | Mona Diab
Proceedings of the The 4th Workshop on Computational Approaches to Code Switching

pdf bib
Proceedings of the First Workshop on Natural Language Processing for Medical Conversations
Parminder Bhatia | Steven Lin | Rashmi Gangadharaiah | Byron Wallace | Izhak Shafran | Chaitanya Shivade | Nan Du | Mona Diab
Proceedings of the First Workshop on Natural Language Processing for Medical Conversations

2019

pdf bib
CASA-NLU : Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented ChatbotsCASA-NLU: Context-Aware Self-Attentive Natural Language Understanding for Task-Oriented Chatbots
Arshit Gupta | Peng Zhang | Garima Lalwani | Mona Diab
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Natural Language Understanding (NLU) is a core component of dialog systems. It typically involves two tasks-Intent Classification (IC) and Slot Labeling (SL), which are then followed by a dialogue management (DM) component. Such NLU systems cater to utterances in isolation, thus pushing the problem of context management to DM. However, contextual information is critical to the correct prediction of intents in a conversation. Prior work on contextual NLU has been limited in terms of the types of contextual signals used and the understanding of their impact on the model. In this work, we propose a context-aware self-attentive NLU (CASA-NLU) model that uses multiple signals over a variable context window, such as previous intents, slots, dialog acts and utterances, in addition to the current user utterance. CASA-NLU outperforms a recurrent contextual NLU baseline on two conversational datasets, yielding a gain of up to 7 % on the IC task. Moreover, a non-contextual variant of CASA-NLU achieves state-of-the-art performance on standard public datasets-SNIPS and ATIS.

pdf bib
Efficient Sentence Embedding using Discrete Cosine Transform
Nada Almarwani | Hanan Aldarmaki | Mona Diab
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Vector averaging remains one of the most popular sentence embedding methods in spite of its obvious disregard for syntactic structure. While more complex sequential or convolutional networks potentially yield superior classification performance, the improvements in classification accuracy are typically mediocre compared to the simple vector averaging. As an efficient alternative, we propose the use of discrete cosine transform (DCT) to compress word sequences in an order-preserving manner. The lower order DCT coefficients represent the overall feature patterns in sentences, which results in suitable embeddings for tasks that could benefit from syntactic features. Our results in semantic probing tasks demonstrate that DCT embeddings indeed preserve more syntactic information compared with vector averaging. With practically equivalent complexity, the model yields better overall performance in downstream classification tasks that correlate with syntactic features, which illustrates the capacity of DCT to preserve word order information.

pdf bib
Multi-Domain Goal-Oriented Dialogues (MultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue DataMultiDoGO): Strategies toward Curating and Annotating Large Scale Dialogue Data
Denis Peskov | Nancy Clarke | Jason Krone | Brigi Fodor | Yi Zhang | Adel Youssef | Mona Diab
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The need for high-quality, large-scale, goal-oriented dialogue datasets continues to grow as virtual assistants become increasingly wide-spread. However, publicly available datasets useful for this area are limited either in their size, linguistic diversity, domain coverage, or annotation granularity. In this paper, we present strategies toward curating and annotating large scale goal oriented dialogue data. We introduce the MultiDoGO dataset to overcome these limitations. With a total of over 81 K dialogues harvested across six domains, MultiDoGO is over 8 times the size of MultiWOZ, the other largest comparable dialogue dataset currently available to the public. Over 54 K of these harvested conversations are annotated for intent classes and slot labels. We adopt a Wizard-of-Oz approach wherein a crowd-sourced worker (the customer) is paired with a trained annotator (the agent). The data curation process was controlled via biases to ensure a diversity in dialogue flows following variable dialogue policies. We provide distinct class label tags for agents vs. customer utterances, along with applicable slot labels. We also compare and contrast our strategies on annotation granularity, i.e. turn vs. sentence level. Furthermore, we compare and contrast annotations curated by leveraging professional annotators vs the crowd. We believe our strategies for eliciting and annotating such a dialogue dataset scales across modalities and domains and potentially languages in the future. To demonstrate the efficacy of our devised strategies we establish neural baselines for classification on the agent and customer utterances as well as slot labeling for each domain.

pdf bib
Identifying Nuances in Fake News vs. Satire : Using Semantic and Linguistic Cues
Or Levi | Pedram Hosseini | Mona Diab | David Broniatowski
Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda

The blurry line between nefarious fake news and protected-speech satire has been a notorious struggle for social media platforms. Further to the efforts of reducing exposure to misinformation on social media, purveyors of fake news have begun to masquerade as satire sites to avoid being demoted. In this work, we address the challenge of automatically classifying fake news versus satire. Previous work have studied whether fake news and satire can be distinguished based on language differences. Contrary to fake news, satire stories are usually humorous and carry some political or social message. We hypothesize that these nuances could be identified using semantic and linguistic cues. Consequently, we train a machine learning method using semantic representation, with a state-of-the-art contextual language model, and with linguistic features based on textual coherence metrics. Empirical evaluation attests to the merits of our approach compared to the language-based baseline and sheds light on the nuances between fake news and satire. As avenues for future work, we consider studying additional linguistic features related to the humor aspect, and enriching the data with current news events, to help identify a political or social message.

pdf bib
GWU NLP at SemEval-2019 Task 7 : Hybrid Pipeline for Rumour Veracity and Stance Classification on Social MediaGWU NLP at SemEval-2019 Task 7: Hybrid Pipeline for Rumour Veracity and Stance Classification on Social Media
Sardar Hamidian | Mona Diab
Proceedings of the 13th International Workshop on Semantic Evaluation

Social media plays a crucial role as the main resource news for information seekers online. However, the unmoderated feature of social media platforms lead to the emergence and spread of untrustworthy contents which harm individuals or even societies. Most of the current automated approaches for automatically determining the veracity of a rumor are not generalizable for novel emerging topics. This paper describes our hybrid system comprising rules and a machine learning model which makes use of replied tweets to identify the veracity of the source tweet. The proposed system in this paper achieved 0.435 F-Macro in stance classification, and 0.262 F-macro and 0.801 RMSE in rumor verification tasks in Task7 of SemEval 2019.

pdf bib
Context-Aware Cross-Lingual Mapping
Hanan Aldarmaki | Mona Diab
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Cross-lingual word vectors are typically obtained by fitting an orthogonal matrix that maps the entries of a bilingual dictionary from a source to a target vector space. Word vectors, however, are most commonly used for sentence or document-level representations that are calculated as the weighted average of word embeddings. In this paper, we propose an alternative to word-level mapping that better reflects sentence-level cross-lingual similarity. We incorporate context in the transformation matrix by directly mapping the averaged embeddings of aligned sentences in a parallel corpus. We also implement cross-lingual mapping of deep contextualized word embeddings using parallel sentences with word alignments. In our experiments, both approaches resulted in cross-lingual sentence embeddings that outperformed context-independent word mapping in sentence translation retrieval. Furthermore, the sentence-level transformation could be used for word-level mapping without loss in word translation quality.

2018

pdf bib
Emotion Detection and Classification in a Multigenre Corpus with Joint Multi-Task Deep Learning
Shabnam Tafreshi | Mona Diab
Proceedings of the 27th International Conference on Computational Linguistics

Detection and classification of emotion categories expressed by a sentence is a challenging task due to subjectivity of emotion. To date, most of the models are trained and evaluated on single genre and when used to predict emotion in different genre their performance drops by a large margin. To address the issue of robustness, we model the problem within a joint multi-task learning framework. We train this model with a multigenre emotion corpus to predict emotions across various genre. Each genre is represented as a separate task, we use soft parameter shared layers across the various tasks. our experimental results show that this model improves the results across the various genres, compared to a single genre training in the same neural net architecture.

pdf bib
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching
Gustavo Aguilar | Fahad AlGhamdi | Victor Soto | Thamar Solorio | Mona Diab | Julia Hirschberg
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching

pdf bib
Team SWEEPer : Joint Sentence Extraction and Fact Checking with Pointer NetworksSWEEPer: Joint Sentence Extraction and Fact Checking with Pointer Networks
Christopher Hidey | Mona Diab
Proceedings of the First Workshop on Fact Extraction and VERification (FEVER)

Many tasks such as question answering and reading comprehension rely on information extracted from unreliable sources. These systems would thus benefit from knowing whether a statement from an unreliable source is correct. We present experiments on the FEVER (Fact Extraction and VERification) task, a shared task that involves selecting sentences from Wikipedia and predicting whether a claim is supported by those sentences, refuted, or there is not enough information. Fact checking is a task that benefits from not only asserting or disputing the veracity of a claim but also finding evidence for that position. As these tasks are dependent on each other, an ideal model would consider the veracity of the claim when finding evidence and also find only the evidence that is relevant. We thus jointly model sentence extraction and verification on the FEVER shared task. Among all participants, we ranked 5th on the blind test set (prior to any additional human evaluation of the evidence).

2017

pdf bib
Transferring Semantic Roles Using Translation and Syntactic Information
Maryam Aminian | Mohammad Sadegh Rasooli | Mona Diab
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Our paper addresses the problem of annotation projection for semantic role labeling for resource-poor languages using supervised annotations from a resource-rich language through parallel data. We propose a transfer method that employs information from source and target syntactic dependencies as well as word alignment density to improve the quality of an iterative bootstrapping method. Our experiments yield a 3.5 absolute labeled F-score improvement over a standard annotation projection method.

pdf bib
Proceedings of the Third Arabic Natural Language Processing Workshop
Nizar Habash | Mona Diab | Kareem Darwish | Wassim El-Hajj | Hend Al-Khalifa | Houda Bouamor | Nadi Tomeh | Mahmoud El-Haj | Wajdi Zaghouani
Proceedings of the Third Arabic Natural Language Processing Workshop

pdf bib
A Layered Language Model based Hybrid Approach to Automatic Full Diacritization of ArabicArabic
Mohamed Al-Badrashiny | Abdelati Hawwari | Mona Diab
Proceedings of the Third Arabic Natural Language Processing Workshop

In this paper we present a system for automatic Arabic text diacritization using three levels of analysis granularity in a layered back off manner. We build and exploit diacritized language models (LM) for each of three different levels of granularity : surface form, morphologically segmented into prefix / stem / suffix, and character level. For each of the passes, we use Viterbi search to pick the most probable diacritization per word in the input. We start with the surface form LM, followed by the morphological level, then finally we leverage the character level LM. Our system outperforms all of the published systems evaluated against the same training and test data. It achieves a 10.87 % WER for complete full diacritization including lexical and syntactic diacritization, and 3.0 % WER for lexical diacritization, ignoring syntactic diacritization.

pdf bib
Arabic Textual Entailment with Word EmbeddingsArabic Textual Entailment with Word Embeddings
Nada Almarwani | Mona Diab
Proceedings of the Third Arabic Natural Language Processing Workshop

Determining the textual entailment between texts is important in many NLP tasks, such as summarization, question answering, and information extraction and retrieval. Various methods have been suggested based on external knowledge sources ; however, such resources are not always available in all languages and their acquisition is typically laborious and very costly. Distributional word representations such as word embeddings learned over large corpora have been shown to capture syntactic and semantic word relationships. Such models have contributed to improving the performance of several NLP tasks. In this paper, we address the problem of textual entailment in Arabic. We employ both traditional features and distributional representations. Crucially, we do not depend on any external resources in the process. Our suggested approach yields state of the art performance on a standard data set, ArbTE, achieving an accuracy of 76.2 % compared to state of the art of 69.3 %.

pdf bib
GW_QA at SemEval-2017 Task 3 : Question Answer Re-ranking on Arabic ForaGW_QA at SemEval-2017 Task 3: Question Answer Re-ranking on Arabic Fora
Nada Almarwani | Mona Diab
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our submission to SemEval-2017 Task 3 Subtask D, Question Answer Ranking in Arabic Community Question Answering. In this work, we applied a supervised machine learning approach to automatically re-rank a set of QA pairs according to their relevance to a given question. We employ features based on latent semantic models, namely WTMF, as well as a set of lexical features based on string lengths and surface level matching. The proposed system ranked first out of 3 submissions, with a MAP score of 61.16 %.