Bin Li


2020

Proceedings of the CoNLL 2020 Shared Task: Cross-Framework Meaning Representation Parsing
Stephan Oepen | Omri Abend | Lasha Abzianidze | Johan Bos | Jan Hajič | Daniel Hershcovich | Bin Li | Tim O'Gorman | Nianwen Xue | Daniel Zeman
Proceedings of the CoNLL 2020 Shared Task: Cross-Framework Meaning Representation Parsing

2019

Building a Chinese AMR Bank with Concept and Relation Alignments
Bin Li | Yuan Wen | Li Song | Weiguang Qu | Nianwen Xue
Linguistic Issues in Language Technology, Volume 18, 2019 - Exploiting Parsed Corpora: Applications in Research, Pedagogy, and Processing

Abstract Meaning Representation (AMR) is a meaning representation framework in which the meaning of a full sentence is represented as a single-rooted, acyclic, directed graph. In this article, we describe an ongoing project to build a Chinese AMR (CAMR) corpus, which currently includes 10,149 sentences from the newsgroup and weblog portion of the Chinese TreeBank (CTB). We describe the annotation specifications for the CAMR corpus, which follow the annotation principles of English AMR but make adaptations where needed to accommodate the linguistic facts of Chinese. The CAMR specifications also include a systematic treatment of sentence-internal discourse relations. One significant change we have made to the AMR annotation methodology is the inclusion of alignments between word tokens in the sentence and the concepts and relations in the CAMR annotation, which makes it easier for automatic parsers to model the correspondence between a sentence and its meaning representation. We developed an annotation tool for CAMR, and the inter-annotator agreement between the two annotators, as measured by the Smatch score, is 0.83, indicating reliable annotation. We also present some quantitative analysis of the CAMR corpus: 46.71% of the sentences have AMRs that are non-tree graphs, and the AMRs of 88.95% of the sentences contain concepts that are inferred from the context of the sentence and do not correspond to a specific word.

Ellipsis in Chinese AMR Corpus
Yihuan Liu | Bin Li | Peiyi Yan | Li Song | Weiguang Qu
Proceedings of the First International Workshop on Designing Meaning Representations

Ellipsis is very common in language, and natural language processing needs to restore the elided elements in a sentence. However, only a few corpora annotate ellipsis, which hinders its automatic detection and recovery. This paper introduces the annotation of ellipsis in Chinese sentences using Abstract Meaning Representation (AMR), a novel graph-based representation that has a good mechanism for restoring elided elements manually. We annotate 5,000 sentences selected from the Chinese TreeBank (CTB). We find that 54.98% of the sentences have ellipses; 92% of the ellipses are restored by copying the antecedents’ concepts, and 12.9% of them are newly added concepts. In addition, we find that the elided element is a word or phrase in most cases, but sometimes only the head of a phrase or part of a phrase, which makes automatic recovery of ellipsis rather hard.

2018

Transition-Based Chinese AMR Parsing
Chuan Wang | Bin Li | Nianwen Xue
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

This paper presents the first AMR parser built on the Chinese AMR bank. By applying a transition-based AMR parsing framework to Chinese, we first investigate how well the transitions originally designed for English AMR parsing generalize to Chinese and provide a comparative analysis of the transitions for English and Chinese. We then perform a detailed error analysis to identify the major challenges in Chinese AMR parsing, which we hope will inform future research in this area.