Proceedings of the 2nd Clinical Natural Language Processing Workshop

Anna Rumshisky, Kirk Roberts, Steven Bethard, Tristan Naumann (Editors)

Anthology ID:
Minneapolis, Minnesota, USA
ClinicalNLP | NAACL | WS
Association for Computational Linguistics
Bib Export formats:

pdf bib
Proceedings of the 2nd Clinical Natural Language Processing Workshop
Anna Rumshisky | Kirk Roberts | Steven Bethard | Tristan Naumann

pdf bib
An Analysis of Attention over Clinical Notes for Predictive Tasks
Sarthak Jain | Ramin Mohammadi | Byron C. Wallace

The shift to electronic medical records (EMRs) has engendered research into machine learning and natural language technologies to analyze patient records, and to predict from these clinical outcomes of interest. Two observations motivate our aims here. First, unstructured notes contained within EMR often contain key information, and hence should be exploited by models. Second, while strong predictive performance is important, interpretability of models is perhaps equally so for applications in this domain. Together, these points suggest that neural models for EMR may benefit from incorporation of attention over notes, which one may hope will both yield performance gains and afford transparency in predictions. In this work we perform experiments to explore this question using two EMR corpora and four different predictive tasks, that : (i) inclusion of attention mechanisms is critical for neural encoder modules that operate over notes fields in order to yield competitive performance, but, (ii) unfortunately, while these boost predictive performance, it is decidedly less clear whether they provide meaningful support for predictions.

pdf bib
Hierarchical Nested Named Entity Recognition
Zita Marinho | Afonso Mendes | Sebastião Miranda | David Nogueira

In the medical domain and other scientific areas, it is often important to recognize different levels of hierarchy in mentions, such as those related to specific symptoms or diseases associated with different anatomical regions. Unlike previous approaches, we build a transition-based parser that explicitly models an arbitrary number of hierarchical and nested mentions, and propose a loss that encourages correct predictions of higher-level mentions. We further introduce a set of modifier classes which introduces certain concepts that change the meaning of an entity, such as absence, or uncertainty about a given disease. Our proposed model achieves state-of-the-art results in medical entity recognition datasets, using both nested and hierarchical mentions.

pdf bib
Towards Automatic Generation of Shareable Synthetic Clinical Notes Using Neural Language Models
Oren Melamud | Chaitanya Shivade

Large-scale clinical data is invaluable to driving many computational scientific advances today. However, understandable concerns regarding patient privacy hinder the open dissemination of such data and give rise to suboptimal siloed research. De-identification methods attempt to address these concerns but were shown to be susceptible to adversarial attacks. In this work, we focus on the vast amounts of unstructured natural language data stored in clinical notes and propose to automatically generate synthetic clinical notes that are more amenable to sharing using generative models trained on real de-identified records. To evaluate the merit of such notes, we measure both their privacy preservation properties as well as utility in training clinical NLP models. Experiments using neural language models yield notes whose utility is close to that of the real ones in some clinical NLP tasks, yet leave ample room for future improvements.

pdf bib
A Novel System for Extractive Clinical Note Summarization using EHR DataEHR Data
Jennifer Liang | Ching-Huei Tsou | Ananya Poddar

While much data within a patient’s electronic health record (EHR) is coded, crucial information concerning the patient’s care and management remain buried in unstructured clinical notes, making it difficult and time-consuming for physicians to review during their usual clinical workflow. In this paper, we present our clinical note processing pipeline, which extends beyond basic medical natural language processing (NLP) with concept recognition and relation detection to also include components specific to EHR data, such as structured data associated with the encounter, sentence-level clinical aspects, and structures of the clinical notes. We report on the use of this pipeline in a disease-specific extractive text summarization task on clinical notes, focusing primarily on progress notes by physicians and nurse practitioners. We show how the addition of EHR-specific components to the pipeline resulted in an improvement in our overall system performance and discuss the potential impact of EHR-specific components on other higher-level clinical NLP tasks.

pdf bib
Medical Entity Linking using Triplet Network
Ishani Mondal | Sukannya Purkayastha | Sudeshna Sarkar | Pawan Goyal | Jitesh Pillai | Amitava Bhattacharyya | Mahanandeeshwar Gattu

Entity linking (or Normalization) is an essential task in text mining that maps the entity mentions in the medical text to standard entities in a given Knowledge Base (KB). This task is of great importance in the medical domain. It can also be used for merging different medical and clinical ontologies. In this paper, we center around the problem of disease linking or normalization. This task is executed in two phases : candidate generation and candidate scoring. In this paper, we present an approach to rank the candidate Knowledge Base entries based on their similarity with disease mention. We make use of the Triplet Network for candidate ranking. While the existing methods have used carefully generated sieves and external resources for candidate generation, we introduce a robust and portable candidate generation scheme that does not make use of the hand-crafted rules. Experimental results on the standard benchmark NCBI disease dataset demonstrate that our system outperforms the prior methods by a significant margin.

pdf bib
Extracting Factual Min / Max Age Information from Clinical Trial StudiesMin/Max Age Information from Clinical Trial Studies
Yufang Hou | Debasis Ganguly | Léa Deleris | Francesca Bonin

Population age information is an essential characteristic of clinical trials. In this paper, we focus on extracting minimum and maximum (min / max) age values for the study samples from clinical research articles. Specifically, we investigate the use of a neural network model for question answering to address this information extraction task. The min / max age QA model is trained on the massive structured clinical study records from For each article, based on multiple min and max age values extracted from the QA model, we predict both actual min / max age values for the study samples and filter out non-factual age expressions. Our system improves the results over (i) a passage retrieval based IE system and (ii) a CRF-based system by a large margin when evaluated on an annotated dataset consisting of 50 research papers on smoking cessation.

pdf bib
Distinguishing Clinical Sentiment : The Importance of Domain Adaptation in Psychiatric Patient Health Records
Eben Holderness | Philip Cawkwell | Kirsten Bolton | James Pustejovsky | Mei-Hua Hall

Recently natural language processing (NLP) tools have been developed to identify and extract salient risk indicators in electronic health records (EHRs). Sentiment analysis, although widely used in non-medical areas for improving decision making, has been studied minimally in the clinical setting. In this study, we undertook, to our knowledge, the first domain adaptation of sentiment analysis to psychiatric EHRs by defining psychiatric clinical sentiment, performing an annotation project, and evaluating multiple sentence-level sentiment machine learning (ML) models. Results indicate that off-the-shelf sentiment analysis tools fail in identifying clinically positive or negative polarity, and that the definition of clinical sentiment that we provide is learnable with relatively small amounts of training data. This project is an initial step towards further refining sentiment analysis methods for clinical use. Our long-term objective is to incorporate the results of this project as part of a machine learning model that predicts inpatient readmission risk. We hope that this work will initiate a discussion concerning domain adaptation of sentiment analysis to the clinical setting.

pdf bib
Attention Neural Model for Temporal Relation Extraction
Sijia Liu | Liwei Wang | Vipin Chaudhary | Hongfang Liu

Neural network models have shown promise in the temporal relation extraction task. In this paper, we present the attention based neural network model to extract the containment relations within sentences from clinical narratives. The attention mechanism used on top of GRU model outperforms the existing state-of-the-art neural network models on THYME corpus in intra-sentence temporal relation extraction.