Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

Yang Liu, Tim Paek, Manasi Patwardhan (Editors)

New Orleans, Louisiana
Association for Computational Linguistics

Pay-Per-Request Deployment of Neural Network Models Using Serverless Architectures
Zhucheng Tu | Mengping Li | Jimmy Lin

We demonstrate the serverless deployment of neural networks for model inferencing in NLP applications using Amazon’s Lambda service for feedforward evaluation and DynamoDB for storing word embeddings. Our architecture realizes a pay-per-request pricing model, requiring zero ongoing costs for maintaining server instances. All virtual machine management is handled behind the scenes by the cloud provider without any direct developer intervention. We describe a number of techniques that allow efficient use of serverless resources, and evaluations confirm that our design is both scalable and inexpensive.

An automated medical scribe for documenting clinical encounters
Gregory Finley | Erik Edwards | Amanda Robinson | Michael Brenndoerfer | Najmeh Sadoughi | James Fone | Nico Axtmann | Mark Miller | David Suendermann-Oeft

A medical scribe is a clinical professional who charts patient-physician encounters in real time, relieving physicians of most of their administrative burden and substantially increasing productivity and job satisfaction. We present a complete implementation of an automated medical scribe. Our system can serve either as a scalable, standardized, and economical alternative to human scribes, or as an assistive tool for them, providing a first draft of a report along with a convenient means to modify it. This solution is, to our knowledge, the first automated scribe ever presented and relies upon multiple speech and language technologies, including speaker diarization, medical speech recognition, knowledge extraction, and natural language generation.

CL Scholar: The ACL Anthology Knowledge Graph Miner
Mayank Singh | Pradeep Dogga | Sohan Patro | Dhiraj Barnwal | Ritam Dutt | Rajarshi Haldar | Pawan Goyal | Animesh Mukherjee

We present CL Scholar, the ACL Anthology knowledge graph miner, to facilitate high-quality search and exploration of current research progress in the computational linguistics community. In contrast to previous work, the periodic crawling, indexing, and processing of new incoming articles are completely automated in the current system. CL Scholar utilizes both textual and network information for knowledge graph construction. As an additional novel initiative, CL Scholar supports more than 1200 scholarly natural language queries along with standard keyword-based search on the constructed knowledge graph. It answers binary, statistical, and list-based natural language queries. The current system is deployed online. We also provide REST API support along with a bulk download facility. Our code and data are available online.

ClaimRank: Detecting Check-Worthy Claims in Arabic and English
Israa Jaradat | Pepa Gencheva | Alberto Barrón-Cedeño | Lluís Màrquez | Preslav Nakov

We present ClaimRank, an online system for detecting check-worthy claims. While originally trained on political debates, the system can work for any kind of text, e.g., interviews or regular news articles. Its aim is to facilitate manual fact-checking efforts by prioritizing the claims that fact-checkers should consider first. ClaimRank supports both Arabic and English. It is trained on actual annotations from nine reputable fact-checking organizations (PolitiFact, FactCheck, ABC, CNN, NPR, NYT, Chicago Tribune, The Guardian, and Washington Post), and thus it can mimic the claim selection strategy of any one of them, as well as that of the union of them all.

360 Stance Detection
Sebastian Ruder | John Glover | Afshin Mehrabani | Parsa Ghaffari

The proliferation of fake news and filter bubbles makes it increasingly difficult to form an unbiased, balanced opinion towards a topic. To ameliorate this, we propose 360 Stance Detection, a tool that aggregates news with multiple perspectives on a topic. It presents them on a spectrum ranging from support to opposition, enabling the user to base their opinion on multiple pieces of diverse evidence.

ELISA-EDL: A Cross-lingual Entity Extraction, Linking and Localization System
Boliang Zhang | Ying Lin | Xiaoman Pan | Di Lu | Jonathan May | Kevin Knight | Heng Ji

We demonstrate ELISA-EDL, a state-of-the-art re-trainable system to extract entity mentions from low-resource languages, link them to external English knowledge bases, and visualize locations related to disaster topics on a world heatmap. We make all of our data sets, resources, and system training and testing APIs publicly available for research purposes.

Entity Resolution and Location Disambiguation in the Ancient Hindu Temples Domain using Web Data
Ayush Maheshwari | Vishwajeet Kumar | Ganesh Ramakrishnan | J. Saketha Nath

We present a system for resolving entities and disambiguating locations based on publicly available web data in the domain of ancient Hindu temples. Scarce, unstructured information poses a challenge to Entity Resolution (ER) and snippet ranking. Additionally, because the same set of entities may be associated with multiple locations, Location Disambiguation (LD) is a problem. The mentions and descriptions of temples number on the order of hundreds of thousands, with such data generated by various users in various forms such as text (Wikipedia pages), videos (YouTube videos), blogs, etc. We demonstrate an integrated approach using a combination of grammar rules for parsing and unsupervised (clustering) algorithms to resolve entities and locations with high confidence. A demo of our system is accessible online. Our system is open source and available on GitHub.

Madly Ambiguous: A Game for Learning about Structural Ambiguity and Why It’s Hard for Computers
Ajda Gokcen | Ethan Hill | Michael White

Madly Ambiguous is an open-source online game aimed at teaching audiences of all ages about structural ambiguity and why it’s hard for computers. After a brief introduction to structural ambiguity, users are challenged to complete a sentence in a way that tricks the computer into guessing an incorrect interpretation. Behind the scenes are two different NLP-based methods for classifying the user’s input, one representative of classic rule-based approaches to disambiguation and the other representative of recent neural network approaches. Qualitative feedback from the system’s use in online, classroom, and science museum settings indicates that it is engaging and successful in conveying the intended take-home messages. A demo of Madly Ambiguous can be played online.

VnCoreNLP: A Vietnamese Natural Language Processing Toolkit
Thanh Vu | Dat Quoc Nguyen | Dai Quoc Nguyen | Mark Dras | Mark Johnson

We present an easy-to-use and fast toolkit, namely VnCoreNLP, a Java NLP annotation pipeline for Vietnamese. VnCoreNLP supports key natural language processing (NLP) tasks including word segmentation, part-of-speech (POS) tagging, named entity recognition (NER), and dependency parsing, and obtains state-of-the-art (SOTA) results for these tasks. We release VnCoreNLP to provide rich linguistic annotations to facilitate research work on Vietnamese NLP. VnCoreNLP is open source and available online.

Generating Continuous Representations of Medical Texts
Graham Spinks | Marie-Francine Moens

We present an architecture that generates medical texts while learning an informative, continuous representation with discriminative features. During training, the input to the system is a dataset of captions for medical X-rays. The acquired continuous representations are of particular interest for use in many machine learning techniques where the discrete and high-dimensional nature of textual input is an obstacle. We use an Adversarially Regularized Autoencoder to create realistic text in both an unconditional and a conditional setting. We show that this technique is applicable to medical texts, which often contain syntactic and domain-specific shorthands. A quantitative evaluation shows that we achieve a lower model perplexity than a traditional LSTM generator.

RiskFinder: A Sentence-level Risk Detector for Financial Reports
Yu-Wen Liu | Liang-Chih Liu | Chuan-Ju Wang | Ming-Feng Tsai

This paper presents a web-based information system, RiskFinder, for facilitating the analysis of soft and hard information in financial reports. In particular, the system broadens the analysis from the word level to the sentence level, which makes the system useful for practitioner communities and unprecedented among financial academics. The proposed system has four main components: 1) a Form 10-K risk-sentiment dataset, consisting of a set of risk-labeled financial sentences and pre-trained sentence embeddings; 2) metadata, including basic information on each company that published the Form 10-K financial report as well as several relevant financial measures; 3) an interface that highlights risk-related sentences in the financial reports based on the latest sentence embedding techniques; 4) a visualization of financial time-series data for a corresponding company. This paper also conducts some case studies to show that the system can be of great help in capturing valuable insight within large amounts of textual information. The system is now available online.

SMILEE: Symmetric Multi-modal Interactions with Language-gesture Enabled (AI) Embodiment
Sujeong Kim | David Salter | Luke DeLuccia | Kilho Son | Mohamed R. Amer | Amir Tamrakar

We demonstrate an intelligent conversational agent system designed for advancing human-machine collaborative tasks. The agent is able to interpret a user’s communicative intent from both their verbal utterances and non-verbal behaviors, such as gestures. The agent is itself also able to communicate with both natural language and gestures through its embodiment as an avatar, thus facilitating natural symmetric multi-modal interactions. We demonstrate two intelligent agents with specialized skills in the Blocks World as use cases of our system.

Decision Conversations Decoded
Léa Deleris | Debasis Ganguly | Killian Levacher | Martin Stephenson | Francesca Bonin

We describe the vision and current version of a Natural Language Processing system aimed at facilitating group decision making. Borrowing from the scientific field of Decision Analysis, its essential role is to identify alternatives and criteria associated with a given decision, and to keep track of who proposed them and of the sentiment expressed towards them. Based on this information, the system can help identify agreement and dissent or recommend an alternative. Overall, it seeks to help a group reach a decision in a natural yet auditable fashion.

Sounding Board: A User-Centric and Content-Driven Social Chatbot
Hao Fang | Hao Cheng | Maarten Sap | Elizabeth Clark | Ari Holtzman | Yejin Choi | Noah A. Smith | Mari Ostendorf

We present Sounding Board, a social chatbot that won the 2017 Amazon Alexa Prize. The system architecture consists of several components including spoken language processing, dialogue management, language generation, and content management, with emphasis on user-centric and content-driven design. We also share insights gained from large-scale online logs based on 160,000 conversations with real-world users.