Linguistic Issues in Language Technology (2019)



Linguistic Issues in Language Technology, Volume 17, 2019

Argument alternations in complex predicates: an LFG+glue perspective
John J. Lowe

Vaidya et al. (2019) discuss argument alternations in Hindi complex predicates, and propose an analysis within an LTAG framework, comparing this with an LFG analysis of complex predicates. In this paper I clarify the inadequacies in existing LFG analyses of complex predicates, and show how the LFG+glue approach proposed by Lowe (2015) can both address these inadequacies and provide a relatively simple treatment of the phenomena discussed by Vaidya et al.

Complex predicates: Structure, potential structure and underspecification
Stefan Müller

This paper compares a recent TAG-based analysis of complex predicates in Hindi/Urdu with its HPSG analog. It points out that TAG combines actual structure, while HPSG (and Categorial Grammar and other valence-based frameworks) specify the valence of lexical items and hence potential structure. This makes it possible to have light verbs decide which arguments of embedded heads get realized, something that is not possible in TAG. TAG has to retreat to disjunctions instead. While this allows straightforward analyses of active/passive alternations based on the light verb in valence-based frameworks, such an option does not exist for TAG, and it has to be assumed that preverbs come with different sets of arguments.

Complex Predicates and Multidimensionality in Grammar
Miriam Butt

This paper contributes to the ongoing discussion of how best to analyze and handle complex predicate formations, commenting in particular on the properties of Hindi N-V complex predicates as set out by Vaidya et al. I highlight features of existing LFG analyses and focus in particular on the modular architecture of LFG, its attendant multidimensional lexicon and the analytic consequences which follow from this. I point out where the previously existing LFG proposals have been misunderstood when viewed through the lens of theories such as LTAG and HPSG, which assume a very different architectural set-up, and provide a comparative discussion of the issues.


Linguistic Issues in Language Technology, Volume 18, 2019 - Exploiting Parsed Corpora: Applications in Research, Pedagogy, and Processing

Building a Chinese AMR Bank with Concept and Relation Alignments
Bin Li | Yuan Wen | Li Song | Weiguang Qu | Nianwen Xue

Abstract Meaning Representation (AMR) is a meaning representation framework in which the meaning of a full sentence is represented as a single-rooted, acyclic, directed graph. In this article, we describe an ongoing project to build a Chinese AMR (CAMR) corpus, which currently includes 10,149 sentences from the newsgroup and weblog portion of the Chinese TreeBank (CTB). We describe the annotation specifications for the CAMR corpus, which follow the annotation principles of English AMR but make adaptations where needed to accommodate the linguistic facts of Chinese. The CAMR specifications also include a systematic treatment of sentence-internal discourse relations. One significant change we have made to the AMR annotation methodology is the inclusion of the alignment between word tokens in the sentence and the concepts/relations in the CAMR annotation, to make it easier for automatic parsers to model the correspondence between a sentence and its meaning representation. We develop an annotation tool for CAMR, and the inter-annotator agreement between the two annotators, as measured by the Smatch score, is 0.83, indicating reliable annotation. We also present some quantitative analysis of the CAMR corpus. 46.71% of the AMRs of the sentences are non-tree graphs. Moreover, the AMRs of 88.95% of the sentences contain concepts that are inferred from the context of the sentence but do not correspond to a specific word.
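
The distinction between a tree and a non-tree AMR graph mentioned in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper or from the CAMR tooling: the triple format, the function name `analyze_graph`, and the English example ("The boy wants to go") are assumptions chosen only for illustration. A graph counts as a non-tree here when some concept has more than one incoming relation (a re-entrancy).

```python
from collections import defaultdict

def analyze_graph(edges):
    """Return (roots, is_acyclic, is_tree) for (source, relation, target) triples.

    Hypothetical helper for illustration; not part of any AMR toolkit.
    """
    nodes = set()
    children = defaultdict(list)
    indegree = defaultdict(int)
    for source, _relation, target in edges:
        nodes.update((source, target))
        children[source].append(target)
        indegree[target] += 1

    # Roots are the nodes without incoming relations.
    roots = sorted(n for n in nodes if indegree[n] == 0)

    # Kahn's algorithm: the graph is acyclic iff every node can be removed.
    remaining = {n: indegree[n] for n in nodes}
    queue = [n for n in nodes if remaining[n] == 0]
    removed = 0
    while queue:
        node = queue.pop()
        removed += 1
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    is_acyclic = removed == len(nodes)

    # A tree has one root and exactly one incoming relation per other node.
    is_tree = (
        is_acyclic
        and len(roots) == 1
        and all(indegree[n] == 1 for n in nodes if n != roots[0])
    )
    return roots, is_acyclic, is_tree

# Illustrative graph for "The boy wants to go": "boy" is the ARG0 of both
# "want-01" and "go-01", so it has two incoming relations.
example = [
    ("want-01", "ARG0", "boy"),
    ("want-01", "ARG1", "go-01"),
    ("go-01", "ARG0", "boy"),
]
roots, is_acyclic, is_tree = analyze_graph(example)
print(roots, is_acyclic, is_tree)
```

On this example the graph is single-rooted and acyclic but not a tree, because the concept shared by the two predicates is reached along two paths.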

Probing the nature of an island constraint with a parsed corpus
Yusuke Kubota | Ai Kubota

This paper presents a case study of the use of the NINJAL Parsed Corpus of Modern Japanese (NPCMJ) for syntactic research. NPCMJ is the first phrase structure-based treebank for Japanese that is specifically designed for application in linguistic (in addition to NLP) research. After discussing some basic methodological issues pertaining to the use of treebanks for theoretical linguistics research, we introduce our case study on the status of the Coordinate Structure Constraint (CSC) in Japanese, showing that NPCMJ enables us to easily retrieve examples that support one of the key claims of Kubota and Lee (2015): that the CSC should be viewed as a pragmatic, rather than a syntactic constraint. The corpus-based study we conducted moreover revealed a previously unnoticed tendency that was highly relevant for further clarifying the principles governing the empirical data in question. We conclude the paper by briefly discussing some further methodological issues brought up by our case study pertaining to the relationship between linguistic research and corpus development.