Rujun Han


Modeling Context in Answer Sentence Selection Systems on a Latency Budget
Rujun Han | Luca Soldaini | Alessandro Moschitti
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Answer Sentence Selection (AS2) is an efficient approach for the design of open-domain Question Answering (QA) systems. In order to achieve low latency, traditional AS2 models score question-answer pairs individually, ignoring any information from the document each potential answer was extracted from. In contrast, more computationally expensive models designed for machine reading comprehension tasks typically receive one or more passages as input, which often results in better accuracy. In this work, we present an approach to efficiently incorporate contextual information in AS2 models. For each answer candidate, we first use unsupervised similarity techniques to extract relevant sentences from its source document, which we then feed into an efficient transformer architecture fine-tuned for AS2. Our best approach, which leverages a multi-way attention architecture to efficiently encode context, improves 6% to 11% over non-contextual state of the art in AS2 with minimal impact on system latency. All experiments in this work were conducted in English.
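
The context-extraction step described in this abstract can be illustrated with a minimal sketch. The abstract only says "unsupervised similarity techniques", so the choice of Jaccard word overlap here is an assumption made for illustration, as are the function names `tokenize`, `jaccard`, and `select_context` and the toy document. The sketch does not cover the transformer or multi-way attention components.

```python
import re

def tokenize(text):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def select_context(candidate, document_sentences, k=2):
    """Return the k document sentences most similar to the candidate,
    excluding the candidate itself, in descending order of similarity.
    Jaccard overlap is an assumed stand-in for the paper's similarity measure."""
    cand_tokens = tokenize(candidate)
    scored = [
        (jaccard(cand_tokens, tokenize(s)), i, s)
        for i, s in enumerate(document_sentences)
        if s != candidate
    ]
    # Sort by similarity (high to low); ties keep document order.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [s for _, _, s in scored[:k]]

# Hypothetical toy document, for illustration only.
document = [
    "Marie Curie was born in Warsaw in 1867.",
    "She won the Nobel Prize in Physics in 1903.",
    "The weather in Paris is mild.",
    "Curie won a second Nobel Prize in 1911.",
]
candidate = "She won the Nobel Prize in Physics in 1903."
context = select_context(candidate, document, k=2)
```

In a pipeline of this kind, the selected sentences would be passed to the answer-scoring model together with the question and the candidate; how they are encoded is specific to the paper and is not reproduced here.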

ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning
Rujun Han | Xiang Ren | Nanyun Peng
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

While pre-trained language models (PTLMs) have achieved noticeable success on many NLP tasks, they still struggle with tasks that require event temporal reasoning, which is essential for event-centric applications. We present a continual pre-training approach that equips PTLMs with targeted knowledge about event temporal relations. We design self-supervised learning objectives to recover masked-out event and temporal indicators and to discriminate sentences from their corrupted counterparts (where event or temporal indicators are replaced). By further pre-training a PTLM with these objectives jointly, we reinforce its attention to event and temporal information, yielding enhanced capability on event temporal reasoning. This Effective CONtinual pre-training framework for Event Temporal reasoning (ECONET) improves the PTLMs' fine-tuning performances across five relation extraction and question answering tasks and achieves new or on-par state-of-the-art performances in most of our downstream tasks.
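
The two kinds of training examples mentioned in this abstract (masked temporal indicators and corrupted sentences) can be sketched as simple data-preparation helpers. This is only an illustration: the indicator list `TEMPORAL_INDICATORS`, the whitespace tokenization, and both function names are assumptions, and the actual pre-training objectives and model are not shown.

```python
# Hypothetical, deliberately tiny indicator vocabulary (an assumption).
TEMPORAL_INDICATORS = {"before", "after", "during", "while", "until", "since"}

def mask_temporal_indicators(sentence, mask_token="[MASK]"):
    """Replace each temporal indicator with mask_token.

    Returns the masked sentence and the list of original indicators,
    which would serve as recovery targets in a masking objective."""
    masked, labels = [], []
    for tok in sentence.split():
        if tok.lower() in TEMPORAL_INDICATORS:
            masked.append(mask_token)
            labels.append(tok.lower())
        else:
            masked.append(tok)
    return " ".join(masked), labels

def corrupt_temporal_indicators(sentence, swap):
    """Build a corrupted counterpart by swapping indicators per `swap`,
    e.g. {"after": "before"}; other tokens are left unchanged."""
    return " ".join(swap.get(tok.lower(), tok) for tok in sentence.split())

original = "She left after the meeting ended"
masked, labels = mask_temporal_indicators(original)
corrupted = corrupt_temporal_indicators(original, {"after": "before"})
```

A discriminator-style objective would then be trained to tell `original` from `corrupted`, while a masking objective would be trained to recover `labels` from `masked`.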

EventPlus: A Temporal Event Understanding Pipeline
Mingyu Derek Ma | Jiao Sun | Mu Yang | Kung-Hsiang Huang | Nuan Wen | Shikhar Singh | Rujun Han | Nanyun Peng
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations

We present EventPlus, a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction. Event information, especially event temporal knowledge, is a type of common sense knowledge that helps people understand how stories evolve and provides predictive hints for future events. EventPlus as the first comprehensive temporal event understanding pipeline provides a convenient tool for users to quickly obtain annotations about events and their temporal information for any user-provided document. Furthermore, we show EventPlus can be easily adapted to other domains (e.g., biomedical domain). We make EventPlus publicly available to facilitate event-related information extraction and downstream applications.


Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction
Rujun Han | Yichao Zhou | Nanyun Peng
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Extracting event temporal relations is a critical task for information extraction and plays an important role in natural language understanding. Prior systems leverage deep learning and pre-trained language models to improve the performance of the task. However, these systems often suffer from two shortcomings: 1) when performing maximum a posteriori (MAP) inference based on neural models, previous systems only used structured knowledge that is assumed to be absolutely correct, i.e., hard constraints; 2) biased predictions on dominant temporal relations when training with a limited amount of data. To address these issues, we propose a framework that enhances deep neural networks with distributional constraints constructed by probabilistic domain knowledge. We solve the constrained inference problem via Lagrangian Relaxation and apply it to end-to-end event temporal relation extraction tasks. Experimental results show our framework is able to improve the baseline neural network models with strong statistical significance on two widely used datasets in news and clinical domains.


Deep Structured Neural Network for Event Temporal Relation Extraction
Rujun Han | I-Hung Hsu | Mu Yang | Aram Galstyan | Ralph Weischedel | Nanyun Peng
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

We propose a novel deep structured learning framework for event temporal relation extraction. The model consists of 1) a recurrent neural network (RNN) to learn scoring functions for pair-wise relations, and 2) a structured support vector machine (SSVM) to make joint predictions. The neural network automatically learns representations that account for long-term contexts to provide robust features for the structured model, while the SSVM incorporates domain knowledge such as transitive closure of temporal relations as constraints to make better globally consistent decisions. By jointly training the two components, our model combines the benefits of both data-driven learning and knowledge exploitation. Experimental results on three high-quality event temporal relation datasets (TCR, MATRES, and TB-Dense) demonstrate that incorporated with pre-trained contextualized embeddings, the proposed model achieves significantly better performances than the state-of-the-art methods on all three datasets. We also provide thorough ablation studies to investigate our model.


Conditional Word Embedding and Hypothesis Testing via Bayes-by-Backprop
Rujun Han | Michael Gill | Arthur Spirling | Kyunghyun Cho
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Conventional word embedding models do not leverage information from document meta-data, and they do not model uncertainty. We address these concerns with a model that incorporates document covariates to estimate conditional word embedding distributions. Our model allows for (a) hypothesis tests about the meanings of terms, (b) assessments as to whether a word is near or far from another conditioned on different covariate values, and (c) assessments as to whether estimated differences are statistically significant.