Proceedings of the First Workshop on Causal Inference and NLP

Amir Feder, Katherine Keith, Emaad Manzoor, Reid Pryzant, Dhanya Sridhar, Zach Wood-Doughty, Jacob Eisenstein, Justin Grimmer, Roi Reichart, Molly Roberts, Uri Shalit, Brandon Stewart, Victor Veitch, Diyi Yang (Editors)


Anthology ID:
2021.cinlp-1
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Venues:
CINLP | EMNLP
SIG:
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/2021.cinlp-1
DOI:
Bib Export formats:
BibTeX MODS XML EndNote

pdf bib
Proceedings of the First Workshop on Causal Inference and NLP
Amir Feder | Katherine Keith | Emaad Manzoor | Reid Pryzant | Dhanya Sridhar | Zach Wood-Doughty | Jacob Eisenstein | Justin Grimmer | Roi Reichart | Molly Roberts | Uri Shalit | Brandon Stewart | Victor Veitch | Diyi Yang

pdf bib
Text as Causal Mediators : Research Design for Causal Estimates of Differential Treatment of Social Groups via Language Aspects
Katherine Keith | Douglas Rice | Brendan O’Connor

Using observed language to understand interpersonal interactions is important in high-stakes decision making. We propose a causal research design for observational (non-experimental) data to estimate the natural direct and indirect effects of social group signals (e.g. race or gender) on speakers’ responses with separate aspects of language as causal mediators. We illustrate the promises and challenges of this framework via a theoretical case study of the effect of an advocate’s gender on interruptions from justices during U.S. Supreme Court oral arguments. We also discuss challenges conceptualizing and operationalizing causal variables such as gender and language that comprise of many components, and we articulate technical open challenges such as temporal dependence between language mediators in conversational settings.

pdf bib
Enhancing Model Robustness and Fairness with Causality : A Regularization Approach
Zhao Wang | Kai Shu | Aron Culotta

Recent work has raised concerns on the risk of spurious correlations and unintended biases in statistical machine learning models that threaten model robustness and fairness. In this paper, we propose a simple and intuitive regularization approach to integrate causal knowledge during model training and build a robust and fair model by emphasizing causal features and de-emphasizing spurious features. Specifically, we first manually identify causal and spurious features with principles inspired from the counterfactual framework of causal inference. Then, we propose a regularization approach to penalize causal and spurious features separately. By adjusting the strength of the penalty for each type of feature, we build a predictive model that relies more on causal features and less on non-causal features. We conduct experiments to evaluate model robustness and fairness on three datasets with multiple metrics. Empirical results show that the new models built with causal awareness significantly improve model robustness with respect to counterfactual texts and model fairness with respect to sensitive attributes.

pdf bib
Sensitivity Analysis for Causal Mediation through Text : an Application to Political Polarization
Graham Tierney | Alexander Volfovsky

We introduce a procedure to examine a text-as-mediator problem from a novel randomized experiment that studied the effect of conversations on political polarization. In this randomized experiment, Americans from the Democratic and Republican parties were either randomly paired with one-another to have an anonymous conversation about politics or alternatively not assigned to a conversation change in political polarization over time was measured for all participants. This paper analyzes the text of the conversations to identify potential mediators of depolarization and is faced with a unique challenge, necessitated by the primary research hypothesis, that individuals in the control condition do not have conversations and so lack observed text data. We highlight the importance of using domain knowledge to perform dimension reduction on the text data, and describe a procedure to characterize indirect effects via text when the text is only observed in one arm of the experiment.

pdf bib
It’s quality and quantity : the effect of the amount of comments on online suicidal posts
Daniel Low | Kelly Zuromski | Daniel Kessler | Satrajit S. Ghosh | Matthew K. Nock | Walter Dempsey

Every day, individuals post suicide notes on social media asking for support, resources, and reasons to live. Some posts receive few comments while others receive many. While prior studies have analyzed whether specific responses are more or less helpful, it is not clear if the quantity of comments received is beneficial in reducing symptoms or in keeping the user engaged with the platform and hence with life. In the present study, we create a large dataset of users’ first r / SuicideWatch (SW) posts from Reddit (N=21,274), collect the comments as well as the user’s subsequent posts (N=1,615,699) to determine whether they post in SW again in the future. We use propensity score stratification, a causal inference method for observational data, and estimate whether the amount of comments as a measure of social support increases or decreases the likelihood of posting again on SW. One hypothesis is that receiving more comments may decrease the likelihood of the user posting in SW in the future, either by reducing symptoms or because comments from untrained peers may be harmful. On the contrary, we find that receiving more comments increases the likelihood a user will post in SW again. We discuss how receiving more comments is helpful, not by permanently relieving symptoms since users make another SW post and their second posts have similar mentions of suicidal ideation, but rather by reinforcing users to seek support and remain engaged with the platform.decrease the likelihood of the user posting in SW in the future, either by reducing symptoms or because comments from untrained peers may be harmful. On the contrary, we find that receiving more comments increases the likelihood a user will post in SW again. We discuss how receiving more comments is helpful, not by permanently relieving symptoms since users make another SW post and their second posts have similar mentions of suicidal ideation, but rather by reinforcing users to seek support and remain engaged with the platform. Furthermore, since receiving only 1 comment —the most common case— decreases the likelihood of posting again by 14% on average depending on the time window, it is important to develop systems that encourage more commenting.