The Chinese National Conference on Computational Linguistics (2020)



pdf (full) bib (full)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

pdf bib
Proceedings of the 19th Chinese National Conference on Computational Linguistics
Maosong Sun (孙茂松) | Sujian Li (李素建) | Yue Zhang (张岳) | Yang Liu (刘洋)

pdf bib
A Joint Model for Graph-based Chinese Dependency Parsing
Xingchen Li | Mingtong Liu | Yujie Zhang | Jinan Xu | Yufeng Chen

In Chinese dependency parsing, joint models of word segmentation, POS tagging and dependency parsing have become the mainstream framework because they eliminate error propagation and share knowledge across tasks; among them, the transition-based model with feature templates has maintained the best performance. Recently, a graph-based joint model of word segmentation and dependency parsing (Yan et al., 2019) achieved better performance, demonstrating the advantages of graph-based models. However, that work cannot provide POS information for downstream tasks, even though research on transition-based models has shown POS tagging to be helpful for dependency parsing. We therefore propose a graph-based joint model for Chinese word segmentation, POS tagging and dependency parsing. We design a character-level POS tagging task and train it jointly with the model of Yan et al. We adopt two methods of joining the POS tagging task: one shares parameters, and the other uses a tag attention mechanism, which enables the three tasks to better share intermediate information and improve each other's performance. Experimental results on the Penn Chinese Treebank (CTB5) show that our joint model improves dependency parsing by 0.38% over the model of Yan et al. Compared with the best transition-based joint model, our model improves word segmentation, POS tagging and dependency parsing by 0.18%, 0.35% and 5.99%, respectively.
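
As a rough illustration of the two sharing schemes the abstract mentions, here is a minimal PyTorch sketch combining a shared encoder with a tag attention mechanism; all layer choices, names, and sizes are illustrative assumptions, not the authors' architecture or that of Yan et al. (2019).

```python
import torch
import torch.nn as nn

class JointSegPosDep(nn.Module):
    """Minimal sketch: parameter sharing plus tag attention (assumed design)."""
    def __init__(self, n_chars, n_pos, d=128):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, d)
        # Shared encoder: all three tasks read its hidden states.
        self.encoder = nn.LSTM(d, d, batch_first=True, bidirectional=True)
        self.pos_head = nn.Linear(2 * d, n_pos)       # character-level POS tags
        self.pos_emb = nn.Embedding(n_pos, 2 * d)     # tag embeddings for attention
        self.arc_head = nn.Bilinear(2 * d, 2 * d, 1)  # stand-in for a biaffine arc scorer

    def forward(self, chars):
        h, _ = self.encoder(self.char_emb(chars))     # (B, T, 2d)
        pos_logits = self.pos_head(h)                 # POS tagging task
        # Tag attention: each position attends over all POS tag embeddings,
        # so parsing sees a soft POS distribution instead of hard tags.
        attn = torch.softmax(pos_logits, dim=-1)      # (B, T, n_pos)
        tag_ctx = attn @ self.pos_emb.weight          # (B, T, 2d)
        rep = h + tag_ctx                             # fuse shared and tag features
        # Score every head-dependent pair for the dependency task.
        T = rep.size(1)
        arc_scores = self.arc_head(
            rep.unsqueeze(2).expand(-1, T, T, -1).reshape(-1, rep.size(-1)),
            rep.unsqueeze(1).expand(-1, T, T, -1).reshape(-1, rep.size(-1)),
        ).view(-1, T, T)
        return pos_logits, arc_scores
```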

pdf bib
Semantic-aware Chinese Zero Pronoun Resolution with Pre-trained Semantic Dependency Parser
Lanqiu Zhang | Zizhuo Shen | Yanqiu Shao

Deep learning-based Chinese zero pronoun resolution models have achieved better performance than traditional machine learning-based models. However, existing work on Chinese zero pronoun resolution has not yet integrated linguistic information well into deep learning-based models. This paper adopts a pre-training-based approach and integrates the semantic representations of a pre-trained Chinese semantic dependency graph parser into a Chinese zero pronoun resolution model. Experimental results on the OntoNotes-5.0 dataset show that our model with the pre-trained Chinese semantic dependency parser improves the F-score by 0.4% over our baseline and obtains better results than other deep learning-based Chinese zero pronoun resolution models. In addition, integrating BERT representations into our model improves its performance by 0.7% over the baseline.
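
To illustrate the integration idea, here is a minimal sketch of fusing frozen parser representations with a baseline resolver's features; the `sdp_encoder` module, names, and sizes are hypothetical stand-ins, not the paper's components.

```python
import torch
import torch.nn as nn

class ZPResolverWithSDP(nn.Module):
    """Sketch: fuse pre-trained semantic-dependency-parser (SDP) states
    into a zero-pronoun candidate scorer (assumed fusion by concatenation)."""
    def __init__(self, sdp_encoder, d_base=256, d_sdp=256):
        super().__init__()
        self.sdp_encoder = sdp_encoder
        for p in self.sdp_encoder.parameters():
            p.requires_grad = False          # keep the pre-trained parser frozen
        self.scorer = nn.Sequential(
            nn.Linear(d_base + d_sdp, 256), nn.ReLU(), nn.Linear(256, 1)
        )

    def forward(self, base_feats, tokens):
        with torch.no_grad():
            sdp_states = self.sdp_encoder(tokens)    # semantic representations
        fused = torch.cat([base_feats, sdp_states], dim=-1)
        return self.scorer(fused)                    # candidate-antecedent score
```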

pdf bib
Refining Data for Text Generation
Wenyu Guan | Qianying Liu | Tianyi Li | Sujian Li

Recent work on data-to-text generation has made progress with neural encoder-decoder architectures. However, the input data are often enormous, not all data records matter for text generation, and inappropriate input can bring noise into the final output. To solve this problem, we propose a two-step approach that first selects and orders the important data records and then generates text from the noise-reduced data. We propose a learning-to-rank model, supervised by a relation extractor, to rank the importance of each record. With the noise-reduced data as input, we implement a text generator that sequentially models the input records and emits a summary. Experiments on the ROTOWIRE dataset verify the effectiveness of our method in both performance and efficiency.
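
The record-ranking step could look roughly like the following pairwise learning-to-rank sketch, where labels from a relation extractor decide which records should outrank others; the pairwise setup, sizes, and index split are assumptions for illustration.

```python
import torch
import torch.nn as nn

class RecordRanker(nn.Module):
    """Sketch: score each data record's importance (assumed architecture)."""
    def __init__(self, d_record=64):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(d_record, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, records):                  # (B, N, d_record)
        return self.score(records).squeeze(-1)   # (B, N) importance scores

ranker = RecordRanker()
loss_fn = nn.MarginRankingLoss(margin=1.0)
records = torch.randn(2, 10, 64)
scores = ranker(records)
# Hypothetical split: the first 5 records were labeled important by the
# relation extractor, the rest were not; push important scores higher.
pos, neg = scores[:, :5].reshape(-1), scores[:, 5:].reshape(-1)
loss = loss_fn(pos, neg, torch.ones_like(pos))
loss.backward()
```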

pdf bib
Chinese Named Entity Recognition via Adaptive Multi-pass Memory Network with Hierarchical Tagging Mechanism
Pengfei Cao | Yubo Chen | Kang Liu | Jun Zhao

Named entity recognition (NER) aims to identify text spans that mention named entities and classify them into pre-defined categories. For Chinese NER, most existing methods are character-based sequence labeling models and achieve great success. However, these methods usually ignore lexical knowledge, which leads to false predictions of entity boundaries, and they have difficulty capturing tag dependencies. In this paper, we propose an Adaptive Multi-pass Memory Network with Hierarchical Tagging Mechanism (AMMNHT) to address both problems. Specifically, to reduce entity boundary errors, we propose an adaptive multi-pass memory network that exploits lexical knowledge; in addition, we propose a hierarchical tagging layer to learn tag dependencies. Experimental results on three widely used Chinese NER datasets demonstrate that our model significantly outperforms other state-of-the-art methods.
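
A minimal sketch of a multi-pass memory read over lexicon-word embeddings, in the spirit of the abstract; the fixed pass count, GRU update, and tensor layout are assumptions, not the paper's adaptive mechanism.

```python
import torch
import torch.nn as nn

class MultiPassMemory(nn.Module):
    """Sketch: each character's query attends over embeddings of lexicon
    words covering it, refined over several passes (assumed design)."""
    def __init__(self, d=128, passes=3):
        super().__init__()
        self.passes = passes
        self.update = nn.GRUCell(d, d)

    def forward(self, char_h, word_mem, mem_mask):
        # char_h: (B, T, d)  word_mem: (B, T, L, d)  mem_mask: (B, T, L) bool
        # Each position is assumed to have at least one valid memory slot.
        q = char_h
        for _ in range(self.passes):
            att = (word_mem @ q.unsqueeze(-1)).squeeze(-1)          # (B, T, L)
            att = att.masked_fill(~mem_mask, float("-inf")).softmax(-1)
            read = (att.unsqueeze(-1) * word_mem).sum(2)            # (B, T, d)
            q = self.update(read.reshape(-1, q.size(-1)),
                            q.reshape(-1, q.size(-1))).view_as(q)
        return q    # lexicon-enriched character representations
```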

pdf bib
Entity Relative Position Representation based Multi-head Selection for Joint Entity and Relation Extraction
Tianyang Zhao | Zhao Yan | Yunbo Cao | Zhoujun Li

Joint entity and relation extraction has received increasing interest recently because it can exploit the interactions between the two steps. Among existing studies, the Multi-Head Selection (MHS) framework extracts entities and relations simultaneously and efficiently, but its performance remains limited. In this paper, we propose several effective improvements to address this problem. First, we propose an entity-specific Relative Position Representation (eRPR) that lets the model fully leverage the distance information between entities and context tokens. Second, we introduce an auxiliary Global Relation Classification (GRC) objective to enhance the learning of local contextual features. Moreover, we improve the semantic representation by adopting the pre-trained language model BERT as the feature encoder. Finally, these components are closely integrated with the multi-head selection framework and optimized jointly. Extensive experiments on two benchmark datasets demonstrate that our approach outperforms previous work on all evaluation metrics, improving relation F1 by +2.40% on CoNLL04 and +1.90% on ACE05.
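
A rough sketch of how relative position information can enter pairwise multi-head selection scores; the clipping window, additive fusion, and sizes are assumptions rather than the paper's exact eRPR formulation.

```python
import torch
import torch.nn as nn

class MHSWithRelPos(nn.Module):
    """Sketch: multi-head selection scoring with a relative-position term."""
    def __init__(self, d=256, n_rel=5, max_dist=8):
        super().__init__()
        self.max_dist = max_dist
        self.U = nn.Linear(d, d, bias=False)
        self.W = nn.Linear(d, d, bias=False)
        self.rel_out = nn.Linear(d, n_rel)
        self.dist_emb = nn.Embedding(2 * max_dist + 1, d)   # clipped relative positions

    def forward(self, h):                       # h: (B, T, d) token representations
        B, T, d = h.shape
        pos = torch.arange(T, device=h.device)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist)
        rel = self.dist_emb(rel + self.max_dist)            # (T, T, d)
        # Score each (token i, head j) pair for every relation label,
        # injecting the distance embedding additively.
        pair = self.U(h).unsqueeze(2) + self.W(h).unsqueeze(1) + rel  # (B, T, T, d)
        return self.rel_out(torch.tanh(pair))               # (B, T, T, n_rel)
```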

pdf bib
Low-Resource Text Classification via Cross-lingual Language Model Fine-tuning
Xiuhong Li | Zhe Li | Jiabao Sheng | Wushour Slamu

Text classification is difficult when manually labeled corpora are scarce. In low-resource agglutinative languages such as Uyghur, Kazakh, and Kyrgyz (the UKK languages), words are formed by concatenating a stem with several suffixes, and stems carry the content of the text. This morphology permits an effectively unbounded vocabulary of derived forms, which leads to high variability in written forms and many redundant features. The two major challenges for low-resource agglutinative text classification are therefore the lack of labeled data in the target domain and the morphological diversity of derivations. Fine-tuning a pre-trained language model is an effective way to provide meaningful, easy-to-use feature extractors for downstream text classification. To this end, we propose AgglutiFiT, a fine-tuning approach for low-resource agglutinative languages: we build a low-noise fine-tuning dataset by morphological analysis and stem extraction, then fine-tune a cross-lingual pre-trained model on it. Moreover, we propose an attention-based fine-tuning strategy that better selects relevant semantic and syntactic information from the pre-trained language model and uses those features for downstream text classification. We evaluate our methods on nine Uyghur, Kazakh, and Kyrgyz classification datasets, where they perform significantly better than several strong baselines.
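
The attention-based fine-tuning strategy might be sketched as a learned softmax over the pre-trained encoder's layer outputs, as below; the layer count, mean-pooling, and classifier head are illustrative assumptions, not AgglutiFiT's actual design.

```python
import torch
import torch.nn as nn

class LayerAttentionClassifier(nn.Module):
    """Sketch: learned attention over pre-trained layer outputs selects
    which layers' features feed the downstream classifier."""
    def __init__(self, n_layers=12, d=768, n_classes=5):
        super().__init__()
        self.layer_logits = nn.Parameter(torch.zeros(n_layers))
        self.fc = nn.Linear(d, n_classes)

    def forward(self, layer_states):            # (n_layers, B, T, d)
        w = torch.softmax(self.layer_logits, dim=0)          # layer weights
        mixed = (w[:, None, None, None] * layer_states).sum(0)  # (B, T, d)
        return self.fc(mixed.mean(dim=1))       # mean-pool tokens, classify
```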

pdf bib
Constructing Uyghur Named Entity Recognition System using Neural Machine Translation Tag Projection
Anwar Azmat | Li Xiao | Yang Yating | Dong Rui | Osman Turghun

Although named entity recognition has achieved great success with the introduction of neural networks, applying these models to low-resource languages such as Uyghur is challenging because they depend on large amounts of annotated training data, and constructing a well-annotated named entity corpus manually is time-consuming and labor-intensive. Most existing methods rely on parallel corpora combined with word alignment tools, but word alignment inevitably introduces errors. In this paper, we address this problem with a named entity tag transfer method based on neural machine translation: we mark the entity boundaries in Chinese sentences and translate the sentences to Uyghur with a neural machine translation system, expecting the self-attention mechanism to align the source and target entities. Experimental results show that a Uyghur named entity recognition system trained on the constructed corpus achieves good performance on the test set, with an F1 score of 73.80% (a 3.79% improvement over the baseline).
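
The tag-transfer idea can be illustrated with a small sketch that wraps source-side entities in boundary markers before translation and reads them back off the NMT output; the marker format and helper names are hypothetical, not the paper's scheme.

```python
import re

def mark_entities(tokens, spans):
    """Wrap entities in boundary markers. spans: (start, end, type) over tokens."""
    out = list(tokens)
    for start, end, etype in sorted(spans, reverse=True):  # back-to-front keeps indices valid
        out.insert(end, f"</{etype}>")
        out.insert(start, f"<{etype}>")
    return out

def recover_entities(translated):
    """Pull (entity_tokens, type) pairs back out of the NMT output."""
    text = " ".join(translated)
    return [(m.group(2).split(), m.group(1))
            for m in re.finditer(r"<(\w+)>\s*(.*?)\s*</\1>", text)]

marked = mark_entities(["约翰", "住", "在", "北京"], [(0, 1, "PER"), (3, 4, "LOC")])
# marked -> ['<PER>', '约翰', '</PER>', '住', '在', '<LOC>', '北京', '</LOC>']
# The marked sentence is fed to the NMT system; markers surviving in the
# Uyghur output yield projected entity annotations via recover_entities.
```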

pdf bib
Mongolian Questions Classification Based on Multi-Head Attention
Guangyi Wang | Feilong Bao | Weihua Wang

Question classification is a crucial subtask in question answering systems. Mongolian is a low-resource language that lacks public labeled corpora, and the complex morphological structure of Mongolian vocabulary aggravates data sparsity. This paper proposes a classification model that combines a Bi-LSTM with a Multi-Head Attention mechanism, which extracts relevant information from different dimensions and representation subspaces. Based on the characteristics of Mongolian word formation, we introduce Mongolian morpheme representations in the embedding layer; the morpheme vector captures the semantics of the Mongolian word. Character vectors and morpheme vectors are concatenated into word vectors, which are fed to the Bi-LSTM to obtain contextual representations; finally, Multi-Head Attention gathers global information for classification. Experimental results on a Mongolian corpus show that our model significantly outperforms baseline systems.
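
A minimal sketch of the described pipeline (concatenated character and morpheme embeddings, Bi-LSTM, multi-head attention, classifier); all sizes, the head count, and the mean-pooling choice are assumptions.

```python
import torch
import torch.nn as nn

class MongolianQC(nn.Module):
    """Sketch: char + morpheme embeddings -> Bi-LSTM -> multi-head
    attention -> classifier (assumed dimensions throughout)."""
    def __init__(self, n_char, n_morph, n_classes, d=100):
        super().__init__()
        self.char_emb = nn.Embedding(n_char, d)
        self.morph_emb = nn.Embedding(n_morph, d)
        self.bilstm = nn.LSTM(2 * d, d, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * d, num_heads=4, batch_first=True)
        self.fc = nn.Linear(2 * d, n_classes)

    def forward(self, chars, morphs):           # both (B, T)
        x = torch.cat([self.char_emb(chars), self.morph_emb(morphs)], dim=-1)
        h, _ = self.bilstm(x)                   # contextual representation
        g, _ = self.attn(h, h, h)               # global information
        return self.fc(g.mean(dim=1))           # pool and classify
```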

pdf bib
Categorizing Offensive Language in Social Networks: A Chinese Corpus, Systems and an Explainable Tool
Xiangru Tang | Xianjun Shen

Recently, more and more data are generated online, much of it filled with offensive language such as threats, swear words, or straightforward insults. This is disgraceful for a progressive society, and the question arises of how language resources and technologies can cope with the challenge. Previous work analyzes the problem as a whole but fails to detect particular types of offensive content in a fine-grained way, mainly because of the lack of annotated data. In this work, we present a densely annotated dataset, COLA

pdf bib
LiveQA: A Question Answering Dataset over Sports Live
Liu Qianying | Jiang Sicong | Wang Yizhong | Li Sujian

In this paper, we introduce LiveQA, a new question answering dataset constructed from play-by-play live broadcasts. It contains 117k multiple-choice questions written by human commentators for over 1,670 NBA games, collected from the Chinese Hupu website. Owing to the nature of sports games, LiveQA can test reasoning across timeline-based live broadcasts, which is challenging compared to existing datasets: its questions require understanding the timeline, tracking events, or doing mathematical computation. Our preliminary experiments show that the dataset poses a challenging problem for question answering models: a strong baseline achieves an accuracy of only 53.1% and cannot beat the dominant-option rule. We release the code and data of this paper for future research.

pdf bib
CAN-GRU: a Hierarchical Model for Emotion Recognition in Dialogue
Ting Jiang | Bing Xu | Tiejun Zhao | Sheng Li

Emotion recognition in dialogue systems has gained attention in natural language processing in recent years because it can be applied to opinion mining from public conversational data on social media. In this paper, we propose a hierarchical model to recognize emotions in dialogue. In the first layer, to extract textual features of utterances, we propose a convolutional self-attention network (CAN): convolution captures n-gram information, and the attention mechanism captures the relevant semantic information among words in the utterance. In the second layer, a GRU-based network captures contextual information in the conversation. We also discuss the effects of unidirectional versus bidirectional networks. Experiments on the Friends and EmotionPush datasets show that our model (CAN-GRU) and its variants achieve better performance than the baselines.
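
The two layers could be sketched as follows, with convolution plus self-attention per utterance and a GRU across utterances; the kernel size, head count, and pooling are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class CANGRU(nn.Module):
    """Sketch: convolutional self-attention per utterance, then a GRU
    over utterance vectors for dialogue context (assumed sizes)."""
    def __init__(self, d=300, n_emotions=7):
        super().__init__()
        self.conv = nn.Conv1d(d, d, kernel_size=3, padding=1)   # n-gram features
        self.attn = nn.MultiheadAttention(d, num_heads=6, batch_first=True)
        self.gru = nn.GRU(d, d, batch_first=True)               # dialogue context
        self.fc = nn.Linear(d, n_emotions)

    def forward(self, dialogue):        # (B, n_utts, n_words, d) word embeddings
        B, U, T, d = dialogue.shape
        x = dialogue.view(B * U, T, d)
        x = self.conv(x.transpose(1, 2)).transpose(1, 2)        # (B*U, T, d)
        x, _ = self.attn(x, x, x)                               # intra-utterance relevance
        utt = x.mean(dim=1).view(B, U, d)                       # utterance vectors
        ctx, _ = self.gru(utt)                                  # context per utterance
        return self.fc(ctx)                                     # (B, n_utts, n_emotions)
```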

pdf bib
A Joint Model for Aspect-Category Sentiment Analysis with Shared Sentiment Prediction Layer
Yuncong Li | Zhe Yang | Cunxiang Yin | Xu Pan | Lunan Cui | Qiang Huang | Ting Wei

Aspect-category sentiment analysis (ACSA) aims to predict the aspect categories mentioned in a text and their corresponding sentiment polarities. Several joint models have been proposed for this task: given a text, they detect all the aspect categories mentioned in it and predict the sentiment polarities toward them at once. Although these joint models achieve promising performance, they train separate parameters for each aspect category and therefore suffer when data for some categories are deficient. To solve this problem, we propose a novel joint model with a shared sentiment prediction layer, which transfers sentiment knowledge between aspect categories and alleviates the data deficiency problem. Experiments on the SemEval-2016 datasets demonstrate the effectiveness of our model.
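
The shared-layer idea can be sketched as per-category attention feeding one sentiment classifier shared by all categories; the attention form and sizes are assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class SharedSentimentACSA(nn.Module):
    """Sketch: per-category attention picks category-specific features,
    but a single shared layer predicts sentiment for every category."""
    def __init__(self, n_categories, d=256, n_polarities=3):
        super().__init__()
        self.cat_query = nn.Embedding(n_categories, d)   # one query per category
        self.detect = nn.Linear(d, 1)                    # is the category mentioned?
        self.sentiment = nn.Linear(d, n_polarities)      # shared across categories

    def forward(self, h):                                # h: (B, T, d) encoder states
        q = self.cat_query.weight                        # (C, d)
        att = torch.softmax(h @ q.t(), dim=1)            # (B, T, C)
        cat_repr = torch.einsum("btc,btd->bcd", att, h)  # (B, C, d)
        return self.detect(cat_repr).squeeze(-1), self.sentiment(cat_repr)
```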

pdf bib
Clickbait Detection with Style-aware Title Modeling and Co-attention
Chuhan Wu | Fangzhao Wu | Tao Qi | Yongfeng Huang

Clickbait is a form of web content designed to attract attention and entice users to click on specific hyperlinks; detecting it is important for online platforms seeking to improve the quality of web content and the satisfaction of users. Clickbait detection is typically formulated as binary classification based on the title and body of a webpage, and existing methods mainly rely on the title content and the relevance between title and body. However, these methods ignore the stylistic patterns of titles, which can provide important clues for identifying clickbait, and they do not model the interactions between the contexts of title and body, which matter for measuring their relevance. In this paper, we propose a clickbait detection approach with style-aware title modeling and co-attention. Specifically, we use Transformers to learn content representations of the title and body, and we compute a content-based clickbait score for each. We further use a character-level Transformer to learn a style-aware title representation that captures the stylistic patterns of the title, from which we compute a title stylistic score. In addition, a co-attention network models the relatedness between the contexts of title and body and enhances their representations by encoding interaction information, from which we compute a title-body matching score. The final clickbait score is predicted by a weighted summation of these four scores.
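
The final aggregation might be sketched as a learned weighted summation of the four sub-scores; the sub-networks here are stand-in linear heads, and the softmax weighting is an assumption about how the combination could be parameterized.

```python
import torch
import torch.nn as nn

class ClickbaitScorer(nn.Module):
    """Sketch: combine title-content, body-content, title-style, and
    title-body-matching scores by learned weights (assumed design)."""
    def __init__(self, d=256):
        super().__init__()
        self.heads = nn.ModuleDict({
            k: nn.Linear(d, 1) for k in ("title", "body", "style", "match")
        })
        self.weights = nn.Parameter(torch.ones(4))       # learned mixing weights

    def forward(self, feats):                            # dict of (B, d) feature vectors
        scores = torch.cat([self.heads[k](feats[k]) for k in
                            ("title", "body", "style", "match")], dim=-1)  # (B, 4)
        w = torch.softmax(self.weights, dim=0)
        return torch.sigmoid(scores @ w)                 # final clickbait probability
```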