Conference on Computational Linguistics and Speech Processing (2021)


Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)

Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)
Lung-Hao Lee | Chia-Hui Chang | Kuan-Yu Chen

A Study on Using Transfer Learning to Improve BERT Model for Emotional Classification of Chinese Lyrics
Jia-Yi Liao | Ya-Hsuan Lin | Kuan-Cheng Lin | Jia-Wei Chang

The explosive growth of music libraries has made music information retrieval and recommendation a critical issue. Recommendation systems based on music emotion recognition are gradually gaining attention. Most studies build music emotion classification models from audio data rather than lyrics. In addition, because of the richness of English language resources, most existing studies focus on English lyrics and rarely on Chinese. For this reason, we propose an approach that uses the BERT pre-trained model and transfer learning to improve emotion classification of Chinese lyrics. Without any task-specific training for Chinese lyric emotion classification: (a) using BERT alone reaches only 50% classification accuracy, while (b) using BERT with transfer learning from the CVAW, CVAP, and CVAT datasets achieves 71% classification accuracy.

Nested Named Entity Recognition for Chinese Electronic Health Records with QA-based Sequence Labeling
Yu-Lun Chiang | Chih-Hao Lin | Cheng-Lung Sung | Keh-Yih Su

This study presents a novel QA-based sequence labeling (QASL) approach that naturally tackles both flat and nested Named Entity Recognition (NER) tasks on a Chinese Electronic Health Records (CEHR) dataset. The proposed QASL approach asks, in parallel, a corresponding natural language question for each specific named entity type, and then identifies the associated NEs of that type with the BIO tagging scheme. Nested NEs are then formed by overlapping the results of the various types. In comparison with pure sequence-labeling (SL) approaches, since the given question carries significant prior knowledge about the specified entity type and the capability of extracting NEs of different types, performance on the nested NER task is improved, reaching an F1-score of 90.70%. Moreover, in comparison with the pure QA-based approach, our proposed approach retains the SL features, which can extract multiple NEs of the same type without knowing the exact number of NEs in a passage in advance. Experiments on our CEHR dataset demonstrate that QASL-based models outperform SL-based models by 6.12% to 7.14% in F1-score.
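The overlap step described in the abstract, decoding one BIO sequence per entity type and unioning the spans, can be illustrated with a short sketch. This is not the authors' code; the tag sequences and entity-type names below are invented for illustration:

```python
def bio_to_spans(tags, entity_type):
    """Decode one BIO tag sequence into (start, end, type) spans (end exclusive)."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag == "B":                      # a new entity begins
            if start is not None:
                spans.append((start, i, entity_type))
            start = i
        elif tag == "O":                    # outside: close any open entity
            if start is not None:
                spans.append((start, i, entity_type))
                start = None
        # "I" simply continues the current entity
    if start is not None:
        spans.append((start, len(tags), entity_type))
    return spans

def overlap_nested(per_type_tags):
    """Union the spans decoded from each type-specific BIO sequence;
    spans of different types may nest inside one another."""
    nested = []
    for etype, tags in per_type_tags.items():
        nested.extend(bio_to_spans(tags, etype))
    return sorted(nested)

# Hypothetical 4-token passage, tagged once per entity type:
tags_by_type = {
    "BodyPart": ["B", "I", "I", "I"],  # the whole phrase is a body part
    "Position": ["B", "I", "O", "O"],  # a position entity nests inside it
}
print(overlap_nested(tags_by_type))  # [(0, 2, 'Position'), (0, 4, 'BodyPart')]
```

Because each entity type is decoded independently, the inner "Position" span and the enclosing "BodyPart" span coexist, which is exactly what a single flat BIO sequence cannot express.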

AI Clerk Platform: Information Extraction DIY Platform
Ru-Yng Chang | Wen-Lun Chen | Cheng-Ju Kao

Information extraction is a core technology of natural language processing: it extracts meaningful phrases/clauses relevant to a particular topic from unstructured or semi-structured content, and it underlies many language technologies and applications. This paper introduces AI Clerk Platform, which aims to accelerate and improve the entire development process of information extraction tools. AI Clerk Platform provides a friendly and intuitive visualized manual labeling interface, lets users define the semantic labels they need, and implements, distributes, and manages manual labeling tasks, so that users can build customized information extraction models without programming and view the models' automatically predicted results in three ways. AI Clerk Platform thereby further assists the development of other natural language processing technologies and derived application services.

Speech Emotion Recognition Based on CNN+LSTM Model
Wei Mou | Pei-Hsuan Shen | Chu-Yun Chu | Yu-Cheng Chiu | Tsung-Hsien Yang | Ming-Hsiang Su

Due to the popularity of intelligent dialogue assistant services, speech emotion recognition has become more and more important. Emotion recognition and emotion analysis can enhance the interaction between humans and machines. This study uses a CNN+LSTM model to implement speech emotion recognition (SER) processing and prediction. The experimental results show that the CNN+LSTM model achieves better performance than the traditional NN model.

A Study on Contextualized Language Modeling for Machine Reading Comprehension
Chin-Ying Wu | Yung-Chang Hsu | Berlin Chen

With the recent breakthrough of deep learning technologies, research on machine reading comprehension (MRC) has attracted much attention and found versatile applications in many use cases. MRC is an important natural language processing (NLP) task aiming to assess the ability of a machine to understand natural language expressions, typically operationalized by first asking questions based on a given text paragraph and then receiving machine-generated answers in accordance with the given context paragraph and questions. In this paper, we leverage two novel pretrained language models built on top of Bidirectional Encoder Representations from Transformers (BERT), namely BERT-wwm and MacBERT, to develop effective MRC methods. In addition, we investigate whether additionally incorporating categorical information about a context paragraph can benefit MRC, which is achieved by clustering the context paragraphs of the training dataset. Finally, an ensemble learning approach is proposed to harness the synergistic power of the two aforementioned BERT-based models so as to further promote MRC performance.

Discussion on domain generalization in the cross-device speaker verification system
Wei-Ting Lin | Yu-Jia Zhang | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan

In this paper, we use domain generalization to improve the performance of a cross-device speaker verification system. Based on a trainable speaker verification system, we use domain generalization algorithms to fine-tune the model parameters. First, we use the VoxCeleb2 dataset to train ECAPA-TDNN as a baseline model. Then, we use the CHT-TDSV dataset and the following domain generalization algorithms to fine-tune it: DANN, CDNN, and Deep CORAL. Our proposed system is tested on 10 different scenarios in the NSYSU-TDSV dataset, covering single and multiple devices. In the multiple-device scenario, the best equal error rate decreased from 18.39 for the baseline to 8.84, successfully achieving cross-device speaker verification.
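For readers unfamiliar with the equal error rate (EER) metric reported above, it is the operating point where the false-acceptance and false-rejection rates coincide. A minimal threshold-sweep computation (a generic sketch with toy scores, not the paper's evaluation code) looks like:

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Sweep candidate thresholds and return the operating point where the
    false-acceptance rate (FAR) and false-rejection rate (FRR) are closest."""
    best_eer, best_gap = None, float("inf")
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)   # true pairs rejected
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)  # false pairs accepted
        if abs(far - frr) < best_gap:
            best_gap, best_eer = abs(far - frr), (far + frr) / 2
    return best_eer

# Toy verification scores: higher means "same speaker"
print(equal_error_rate([0.9, 0.8, 0.7, 0.3], [0.6, 0.4, 0.2, 0.1]))  # 0.25
```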

Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition
Yi-Chang Chen | Chun-Yen Cheng | Chien-An Chen | Ming-Chieh Sung | Yi-Ren Yeh

Due to recent advances in natural language processing, several works have applied the pre-trained masked language model (MLM) of BERT to post-correction of speech recognition. However, existing pre-trained models only consider semantic correction, while the phonetic features of words are neglected. Semantic-only post-correction consequently degrades performance, since homophone errors are fairly common in Chinese ASR. In this paper, we propose a novel approach that collectively exploits the contextualized representation and the phonetic information between an error and its replacement candidates to reduce the error rate of Chinese ASR. Our experimental results on real-world speech recognition datasets show that our proposed method achieves an evidently lower CER than the baseline model, which uses a pre-trained BERT MLM as the corrector.

A Preliminary Study on Environmental Sound Classification Leveraging Large-Scale Pretrained Model and Semi-Supervised Learning
You-Sheng Tsao | Tien-Hong Lo | Jiun-Ting Li | Shi-Yan Weng | Berlin Chen

With the widespread commercialization of smart devices, research on environmental sound classification has gained more and more attention in recent years. In this paper, we set out to make effective use of a large-scale audio pretrained model and a semi-supervised model training paradigm for environmental sound classification. To this end, an environmental sound classification method is first put forward, whose component model is built on top of a large-scale audio pretrained model. Further, to simulate a low-resource sound classification setting where only limited supervised examples are available, we instantiate semi-supervised model training with a recently proposed training algorithm (namely, FixMatch) and a data augmentation method (namely, SpecAugment). Experiments conducted on the benchmark dataset UrbanSound8K reveal that our classification method leads to an accuracy improvement of 2.4% over a current baseline method.
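SpecAugment, the data augmentation method named in the abstract, zeroes out a random band of frequency bins and a random span of time frames in the spectrogram. A toy sketch on a list-of-lists spectrogram (the mask widths and seed here are illustrative, not the paper's settings):

```python
import random

def spec_augment(spec, freq_width=2, time_width=2, seed=0):
    """Zero out one random frequency band and one random time span,
    SpecAugment-style, on a spectrogram of rows (freq bins) x cols (frames)."""
    rng = random.Random(seed)
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]          # copy; leave the input untouched
    f0 = rng.randrange(n_freq - freq_width + 1)
    for f in range(f0, f0 + freq_width):    # frequency mask
        out[f] = [0.0] * n_time
    t0 = rng.randrange(n_time - time_width + 1)
    for row in out:                         # time mask
        for t in range(t0, t0 + time_width):
            row[t] = 0.0
    return out

masked = spec_augment([[1.0] * 5 for _ in range(4)])
print(sum(v == 0.0 for row in masked for v in row))  # 14 cells masked (2*5 + 2*4 - 2*2)
```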

Mining Commonsense and Domain Knowledge from Math Word Problems
Shih-Hung Tsai | Chao-Chun Liang | Hsin-Min Wang | Keh-Yih Su

Current neural math solvers learn to incorporate commonsense or domain knowledge by utilizing pre-specified constants or formulas. However, as these constants and formulas are mainly human-specified, the generalizability of the solvers is limited. In this paper, we propose to explicitly retrieve the required knowledge from math problem datasets. In this way, we can precisely characterize the required knowledge and improve the explainability of solvers. Our two algorithms take the problem text and the solution equations as input, then deduce the required commonsense and domain knowledge by integrating information from both parts. We construct two math datasets and show that our algorithms can effectively retrieve the required knowledge for problem-solving.

A BERT-based Siamese-structured Retrieval Model
Hung-Yun Chiang | Kuan-Yu Chen

Due to the development of deep learning, natural language processing tasks have made great progress by leveraging Bidirectional Encoder Representations from Transformers (BERT). The goal of information retrieval is to find the results most relevant to a user's query from a large set of documents. Although BERT-based retrieval models have shown excellent results in many studies, these models usually suffer from the need for large amounts of computation and/or additional storage space. In view of these flaws, a BERT-based Siamese-structured retrieval model (BESS) is proposed in this paper. BESS not only inherits the merits of pre-trained language models, but also automatically generates extra information to complement the original query. In addition, a reinforcement learning strategy is introduced to make the model more robust. Accordingly, we evaluate BESS on three publicly available corpora, and the experimental results demonstrate the efficiency of the proposed retrieval model.

Using Valence and Arousal-infused Bi-LSTM for Sentiment Analysis in Social Media Product Reviews
Yu-Ya Cheng | Wen-Chao Yeh | Yan-Ming Chen | Yung-Chun Chang

With the popularity of the current Internet age, online social platforms have provided a bridge for communication among private companies, public organizations, and the public. The purpose of this research is to understand users' experience of a product by analyzing product review data in different domains. We propose a BiLSTM-based neural network infused with rich emotional information. In addition to considering valence and arousal, the smallest units of emotional information, the dependency relationships within texts are also integrated into the deep learning model for sentiment analysis. The experimental results show that this approach achieves good performance in predicting word-level valence and arousal. Moreover, integrating VA and dependency information into the BiLSTM model yields excellent performance on social-text sentiment analysis, which verifies that the model is effective for emotion recognition in short social media texts.

Aggregating User-Centric and Post-Centric Sentiments from Social Media for Topical Stance Prediction
Jenq-Haur Wang | Kuan-Ting Chen

Conventional opinion polls are usually conducted via questionnaires or phone interviews, which are time-consuming and error-prone. With the advances in social networking platforms, it is easier for the general public to express their opinions on popular topics. Given the huge amount of user opinions, it would be useful to automatically collect and aggregate the overall stance on a specific topic. In this paper, we propose to predict topical stances from social media by concept expansion, sentiment classification, and stance aggregation based on word embeddings. For concept expansion of a given topic, related posts are collected from social media and clustered by word embeddings, and major keywords are extracted with word segmentation and named entity recognition methods. For sentiment classification and aggregation, machine learning methods are used to train a sentiment lexicon with word embeddings, and the sentiment scores from user-centric and post-centric views are aggregated into the overall stance on the topic. In the experiments, we evaluated the performance of our proposed approach using social media data from online forums. The experimental results on the 2016 Taiwan Presidential Election show that our proposed method can effectively expand keywords and aggregate topical stances from the public for accurate prediction of election results. The best performance is 0.52% in terms of mean absolute error (MAE). Further investigation is needed to evaluate the performance of the proposed method at larger scales.
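The aggregation step, combining post-centric and user-centric sentiment into one topical stance, could be sketched as follows. The equal weighting `alpha` and the `(user, sentiment)` input format are assumptions for illustration, not the paper's exact formulation:

```python
def aggregate_stance(posts, alpha=0.5):
    """Blend a post-centric view (mean sentiment over all posts) with a
    user-centric view (mean of each user's average sentiment).
    `posts` is a list of (user, sentiment) pairs, sentiment in [-1, 1]."""
    post_view = sum(s for _, s in posts) / len(posts)
    by_user = {}
    for user, s in posts:
        by_user.setdefault(user, []).append(s)
    user_view = sum(sum(v) / len(v) for v in by_user.values()) / len(by_user)
    return alpha * user_view + (1 - alpha) * post_view

# One prolific supporter vs. one detractor: the user-centric view
# discounts repeated posts from the same account.
print(aggregate_stance([("a", 1.0), ("a", 1.0), ("a", 1.0), ("b", -1.0)]))  # 0.25
```

Averaging per user before averaging across users is what makes the aggregate robust to a few very active accounts dominating the post-level signal.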

Hidden Advertorial Detection on Social Media in Chinese
Meng-Ching Ho | Ching-Yun Chuang | Yi-Chun Hsu | Yu-Yun Chang

Nowadays, many advertisements on social media are disguised as normal posts or experience sharing. There has been little research on advertorial detection for Mandarin Chinese texts. This paper therefore focuses on detecting hidden advertorials in online posts written in Taiwan Mandarin Chinese. We inspected seven contextual features based on linguistic theories at the discourse level; these features can be further grouped into three schemas under the general advertorial writing structure. We then used these features to train a multi-task BERT model to detect advertorials. The results suggest that specific linguistic features help in identifying advertorials.

Automatic Extraction of English Grammar Pattern Correction Rules
Kuan-Yu Shen | Yi-Chien Lin | Jason S. Chang

We introduce a method for generating error-correction rules for grammar pattern errors in a given annotated learner corpus. In our approach, annotated edits in the learner corpus are converted into edit rules for correcting common writing errors. The method involves automatic extraction of grammar patterns, and automatic alignment of the erroneous patterns and correct patterns. At run-time, grammar patterns are extracted from the grammatically correct sentences, and correction rules are retrieved by aligning the extracted grammar patterns with the erroneous patterns. Using the proposed method, we generate 1,499 high-quality correction rules related to 232 headwords. The method can be used to assist ESL students in avoiding grammatical errors, and aid teachers in correcting students’ essays. Additionally, the method can be used in the compilation of collocation error dictionaries and the construction of grammar error correction systems.
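The alignment of erroneous and correct patterns into replace-style edit rules can be illustrated with a generic token-diff sketch. The authors' extraction procedure is grammar-pattern-aware; this toy version (with invented example patterns) only shows the alignment idea:

```python
from difflib import SequenceMatcher

def extract_correction_rule(wrong_pattern, right_pattern):
    """Align the tokens of an erroneous grammar pattern with its correction
    and return the non-matching segments as (wrong_span, right_span) rules."""
    wrong, right = wrong_pattern.split(), right_pattern.split()
    rules = []
    for op, i1, i2, j1, j2 in SequenceMatcher(None, wrong, right).get_opcodes():
        if op != "equal":                   # replace / delete / insert segments
            rules.append((" ".join(wrong[i1:i2]), " ".join(right[j1:j2])))
    return rules

print(extract_correction_rule("discuss about n", "discuss n"))          # [('about', '')]
print(extract_correction_rule("interested about n", "interested in n"))  # [('about', 'in')]
```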

Learning to Find Translation of Grammar Patterns in Parallel Corpus
Kai-Wen Tuan | Yi-Jyun Chen | Yi-Chien Lin | Chun-Ho Kwok | Hai-Lun Tu | Jason S. Chang

We introduce a method for assisting English as a Second Language (ESL) learners by providing translations of Collins COBUILD grammar patterns (GPs) for a given word. In our approach, a bilingual parallel corpus is transformed into bilingual GP pairs, aimed at providing native-language support for learning word usage through GPs. The method involves automatically parsing sentences to extract GPs, automatically generating GP translation pairs from bilingual sentences, and automatically extracting common bilingual GPs. At run-time, the target word is used to look up GPs and translations, and the retrieved common GPs and their example sentences are shown to the user. We present a prototype phrase search engine, Linggle GPTrans, that implements these methods to assist ESL learners. A preliminary evaluation on a set of more than 300 GP-translation pairs shows that the methods achieve 91% accuracy.

SoochowDS at ROCLING-2021 Shared Task: Text Sentiment Analysis Using BERT and LSTM
Ruei-Cyuan Su | Sig-Seong Chong | Tzu-En Su | Ming-Hsiang Su

For this shared task, this paper proposes a method that combines a BERT-based word vector model with an LSTM prediction model to predict the valence and arousal values of a text. The BERT-based word vectors are 768-dimensional, and each word vector in the sentence is fed sequentially into the LSTM model for prediction. The experimental results show that our proposed method performs better than the Lasso regression model.

NCU-NLP at ROCLING-2021 Shared Task: Using MacBERT Transformers for Dimensional Sentiment Analysis
Man-Chen Hung | Chao-Yi Chen | Pin-Jung Chen | Lung-Hao Lee

We use MacBERT transformers and fine-tune them for the ROCLING-2021 shared task using the CVAT and CVAS data. We compare the performance of MacBERT with two other transformers, BERT and RoBERTa, in the valence and arousal dimensions, respectively. MAE and the correlation coefficient (r) were used as evaluation metrics. On the ROCLING-2021 test set, our MacBERT model achieves an MAE of 0.611 and an r of 0.904 in the valence dimension, and an MAE of 0.938 and an r of 0.549 in the arousal dimension.
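The two evaluation metrics used throughout the shared task, MAE and Pearson's correlation coefficient r, are standard and can be computed as follows (a generic sketch, not the official scoring script):

```python
def mae(pred, gold):
    """Mean absolute error between predicted and gold-standard values."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(gold)

def pearson_r(pred, gold):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(gold)
    mp, mg = sum(pred) / n, sum(gold) / n
    cov = sum((p - mp) * (g - mg) for p, g in zip(pred, gold))
    sp = sum((p - mp) ** 2 for p in pred) ** 0.5
    sg = sum((g - mg) ** 2 for g in gold) ** 0.5
    return cov / (sp * sg)
```

Lower MAE is better, while r closer to 1 is better, which is why a system can rank differently on the two metrics.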

ROCLING-2021 Shared Task: Dimensional Sentiment Analysis for Educational Texts
Liang-Chih Yu | Jin Wang | Bo Peng | Chu-Ren Huang

This paper presents the ROCLING 2021 shared task on dimensional sentiment analysis for educational texts, which seeks to identify a real-valued sentiment score, in both the valence and arousal dimensions, for self-evaluation comments written by Chinese students. Valence represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. Of the 7 teams registered for this two-dimensional sentiment analysis shared task, 6 submitted results. We expect that this evaluation campaign will produce more advanced dimensional sentiment analysis techniques for the educational domain. All data sets with gold standards and the scoring script are made publicly available to researchers.