Ranit Aharonov


2020

pdf bib
Out of the Echo Chamber : Detecting Countering Debate SpeechesDetecting Countering Debate Speeches
Matan Orbach | Yonatan Bilu | Assaf Toledo | Dan Lahav | Michal Jacovi | Ranit Aharonov | Noam Slonim
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

An educated and informed consumption of media content has become a challenge in modern times. With the shift from traditional news outlets to social media and similar venues, a major concern is that readers are becoming encapsulated in echo chambers and may fall prey to fake news and disinformation, lacking easy access to dissenting views. We suggest a novel task aiming to alleviate some of these concerns that of detecting articles that most effectively counter the arguments and not just the stance made in a given text. We study this problem in the context of debate speeches. Given such a speech, we aim to identify, from among a set of speeches on the same topic and with an opposing stance, the ones that directly counter it. We provide a large dataset of 3,685 such speeches (in English), annotated for this relation, which hopefully would be of general interest to the NLP community. We explore several algorithms addressing this task, and while some are successful, all fall short of expert human performance, suggesting room for further research. All data collected during this work is freely available for research.

pdf bib
A Survey of the State of Explainable AI for Natural Language ProcessingAI for Natural Language Processing
Marina Danilevsky | Kun Qian | Ranit Aharonov | Yannis Katsis | Ban Kawas | Prithviraj Sen
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Recent years have seen important advances in the quality of state-of-the-art models, but this has come at the expense of models becoming less interpretable. This survey presents an overview of the current state of Explainable AI (XAI), considered within the domain of Natural Language Processing (NLP). We discuss the main categorization of explanations, as well as the various ways explanations can be arrived at and visualized. We detail the operations and explainability techniques currently available for generating explanations for NLP model predictions, to serve as a resource for model developers in the community. Finally, we point out the current gaps and encourage directions for future work in this important research area.

pdf bib
Unsupervised Expressive Rules Provide Explainability and Assist Human Experts Grasping New Domains
Eyal Shnarch | Leshem Choshen | Guy Moshkowich | Ranit Aharonov | Noam Slonim
Findings of the Association for Computational Linguistics: EMNLP 2020

Approaching new data can be quite deterrent ; you do not know how your categories of interest are realized in it, commonly, there is no labeled data at hand, and the performance of domain adaptation methods is unsatisfactory. Aiming to assist domain experts in their first steps into a new task over a new corpus, we present an unsupervised approach to reveal complex rules which cluster the unexplored corpus by its prominent categories (or facets). These rules are human-readable, thus providing an important ingredient which has become in short supply lately-explainability. Each rule provides an explanation for the commonality of all the texts it clusters together. The experts can then identify which rules best capture texts of their categories of interest, and utilize them to deepen their understanding of these categories. These rules can also bootstrap the process of data labeling by pointing at a subset of the corpus which is enriched with texts demonstrating the target categories. We present an extensive evaluation of the usefulness of these rules in identifying target categories, as well as a user study which assesses their interpretability.

2019

pdf bib
Automatic Argument Quality Assessment-New Datasets and Methods
Assaf Toledo | Shai Gretz | Edo Cohen-Karlik | Roni Friedman | Elad Venezian | Dan Lahav | Michal Jacovi | Ranit Aharonov | Noam Slonim
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We explore the task of automatic assessment of argument quality. To that end, we actively collected 6.3k arguments, more than a factor of five compared to previously examined data. Each argument was explicitly and carefully annotated for its quality. In addition, 14k pairs of arguments were annotated independently, identifying the higher quality argument in each pair. In spite of the inherent subjective nature of the task, both annotation schemes led to surprisingly consistent results. We release the labeled datasets to the community. Furthermore, we suggest neural methods based on a recently released language model, for argument ranking as well as for argument-pair classification. In the former task, our results are comparable to state-of-the-art ; in the latter task our results significantly outperform earlier methods.

pdf bib
Crowd-sourcing annotation of complex NLU tasks : A case study of argumentative content annotationNLU tasks: A case study of argumentative content annotation
Tamar Lavee | Lili Kotlerman | Matan Orbach | Yonatan Bilu | Michal Jacovi | Ranit Aharonov | Noam Slonim
Proceedings of the First Workshop on Aggregating and Analysing Crowdsourced Annotations for NLP

Recent advancements in machine reading and listening comprehension involve the annotation of long texts. Such tasks are typically time consuming, making crowd-annotations an attractive solution, yet their complexity often makes such a solution unfeasible. In particular, a major concern is that crowd annotators may be tempted to skim through long texts, and answer questions without reading thoroughly. We present a case study of adapting this type of task to the crowd. The task is to identify claims in a several minute long debate speech. We show that sentence-by-sentence annotation does not scale and that labeling only a subset of sentences is insufficient. Instead, we propose a scheme for effectively performing the full, complex task with crowd annotators, allowing the collection of large scale annotated datasets. We believe that the encountered challenges and pitfalls, as well as lessons learned, are relevant in general when collecting data for large scale natural language understanding (NLU) tasks.

pdf bib
Towards Effective Rebuttal : Listening Comprehension Using Corpus-Wide Claim Mining
Tamar Lavee | Matan Orbach | Lili Kotlerman | Yoav Kantor | Shai Gretz | Lena Dankin | Michal Jacovi | Yonatan Bilu | Ranit Aharonov | Noam Slonim
Proceedings of the 6th Workshop on Argument Mining

Engaging in a live debate requires, among other things, the ability to effectively rebut arguments claimed by your opponent. In particular, this requires identifying these arguments. Here, we suggest doing so by automatically mining claims from a corpus of news articles containing billions of sentences, and searching for them in a given speech. This raises the question of whether such claims indeed correspond to those made in spoken speeches. To this end, we collected a large dataset of 400 speeches in English discussing 200 controversial topics, mined claims for each topic, and asked annotators to identify the mined claims mentioned in each speech. Results show that in the vast majority of speeches debaters indeed make use of such claims. In addition, we present several baselines for the automatic detection of mined claims in speeches, forming the basis for future work. All collected data is freely available for research.

pdf bib
Are You Convinced? Choosing the More Convincing Evidence with a Siamese NetworkSiamese Network
Martin Gleize | Eyal Shnarch | Leshem Choshen | Lena Dankin | Guy Moshkowich | Ranit Aharonov | Noam Slonim
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

With the advancement in argument detection, we suggest to pay more attention to the challenging task of identifying the more convincing arguments. Machines capable of responding and interacting with humans in helpful ways have become ubiquitous. We now expect them to discuss with us the more delicate questions in our world, and they should do so armed with effective arguments. But what makes an argument more persuasive? What will convince you? In this paper, we present a new data set, IBM-EviConv, of pairs of evidence labeled for convincingness, designed to be more challenging than existing alternatives. We also propose a Siamese neural network architecture shown to outperform several baselines on both a prior convincingness data set and our own. Finally, we provide insights into our experimental results and the various kinds of argumentative value our method is capable of detecting.

pdf bib
From Surrogacy to Adoption ; From Bitcoin to Cryptocurrency : Debate Topic Expansion
Roy Bar-Haim | Dalia Krieger | Orith Toledo-Ronen | Lilach Edelstein | Yonatan Bilu | Alon Halfon | Yoav Katz | Amir Menczel | Ranit Aharonov | Noam Slonim
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

When debating a controversial topic, it is often desirable to expand the boundaries of discussion. For example, we may consider the pros and cons of possible alternatives to the debate topic, make generalizations, or give specific examples. We introduce the task of Debate Topic Expansion-finding such related topics for a given debate topic, along with a novel annotated dataset for the task. We focus on relations between Wikipedia concepts, and show that they differ from well-studied lexical-semantic relations such as hypernyms, hyponyms and antonyms. We present algorithms for finding both consistent and contrastive expansions and demonstrate their effectiveness empirically. We suggest that debate topic expansion may have various use cases in argumentation mining.

2018

pdf bib
Proceedings of the 5th Workshop on Argument Mining
Noam Slonim | Ranit Aharonov
Proceedings of the 5th Workshop on Argument Mining

pdf bib
Will it Blend? Blending Weak and Strong Labeled Data in a Neural Network for Argumentation Mining
Eyal Shnarch | Carlos Alzate | Lena Dankin | Martin Gleize | Yufang Hou | Leshem Choshen | Ranit Aharonov | Noam Slonim
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

The process of obtaining high quality labeled data for natural language understanding tasks is often slow, error-prone, complicated and expensive. With the vast usage of neural networks, this issue becomes more notorious since these networks require a large amount of labeled data to produce satisfactory results. We propose a methodology to blend high quality but scarce strong labeled data with noisy but abundant weak labeled data during the training of neural networks. Experiments in the context of topic-dependent evidence detection with two forms of weak labeled data show the advantages of the blending scheme. In addition, we provide a manually annotated data set for the task of topic-dependent evidence detection. We believe that blending weak and strong labeled data is a general notion that may be applicable to many language understanding tasks, and can especially assist researchers who wish to train a network but have a small amount of high quality labeled data for their task of interest.

2017

pdf bib
Unsupervised corpuswide claim detection
Ran Levy | Shai Gretz | Benjamin Sznajder | Shay Hummel | Ranit Aharonov | Noam Slonim
Proceedings of the 4th Workshop on Argument Mining

Automatic claim detection is a fundamental argument mining task that aims to automatically mine claims regarding a topic of consideration. Previous works on mining argumentative content have assumed that a set of relevant documents is given in advance. Here, we present a first corpus wide claim detection framework, that can be directly applied to massive corpora. Using simple and intuitive empirical observations, we derive a claim sentence query by which we are able to directly retrieve sentences in which the prior probability to include topic-relevant claims is greatly enhanced. Next, we employ simple heuristics to rank the sentences, leading to an unsupervised corpuswide claim detection system, with precision that outperforms previously reported results on the task of claim detection given relevant documents and labeled data.