Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

Yuen-Hsien Tseng, Hsin-Hsi Chen, Vincent Ng, Mamoru Komachi (Editors)


Anthology ID:
W18-37
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venues:
ACL | NLP-TEA | WS
SIG:
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/W18-37
DOI:
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
https://aclanthology.org/W18-37.pdf

pdf bib
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications
Yuen-Hsien Tseng | Hsin-Hsi Chen | Vincent Ng | Mamoru Komachi

pdf bib
Feature Optimization for Predicting Readability of Arabic L1 and L2Arabic L1 and L2
Hind Saddiki | Nizar Habash | Violetta Cavalli-Sforza | Muhamed Al Khalil

Advances in automatic readability assessment can impact the way people consume information in a number of domains. Arabic, being a low-resource and morphologically complex language, presents numerous challenges to the task of automatic readability assessment. In this paper, we present the largest and most in-depth computational readability study for Arabic to date. We study a large set of features with varying depths, from shallow words to syntactic trees, for both L1 and L2 readability tasks. Our best L1 readability accuracy result is 94.8 % (75 % error reduction from a commonly used baseline). The comparable results for L2 are 72.4 % (45 % error reduction). We also demonstrate the added value of leveraging L1 features for L2 readability prediction.

pdf bib
A Tutorial Markov Analysis of Effective Human Tutorial SessionsMarkov Analysis of Effective Human Tutorial Sessions
Nabin Maharjan | Vasile Rus

This paper investigates what differentiates effective tutorial sessions from less effective sessions. Towards this end, we characterize and explore human tutors’ actions in tutorial dialogue sessions by mapping the tutor-tutee interactions, which are streams of dialogue utterances, into streams of actions, based on the language-as-action theory. Next, we use human expert judgment measures, evidence of learning (EL) and evidence of soundness (ES), to identify effective and ineffective sessions. We perform sub-sequence pattern mining to identify sub-sequences of dialogue modes that discriminate good sessions from bad sessions. We finally use the results of sub-sequence analysis method to generate a tutorial Markov process for effective tutorial sessions.

pdf bib
Thank Goodness ! A Way to Measure Style in Student Essays
Sandeep Mathias | Pushpak Bhattacharyya

Essays have two major components for scoring-content and style. In this paper, we describe a property of the essay, called goodness, and use it to predict the score given for the style of student essays. We compare our approach to solve this problem with baseline approaches, like language modeling and also a state-of-the-art deep learning system. We show that, despite being quite intuitive, our approach is very powerful in predicting the style of the essays.

pdf bib
Overview of NLPTEA-2018 Share Task Chinese Grammatical Error DiagnosisNLPTEA-2018 Share Task Chinese Grammatical Error Diagnosis
Gaoqi Rao | Qi Gong | Baolin Zhang | Endong Xun

This paper presents the NLPTEA 2018 shared task for Chinese Grammatical Error Diagnosis (CGED) which seeks to identify grammatical error types, their range of occurrence and recommended corrections within sentences written by learners of Chinese as foreign language. We describe the task definition, data preparation, performance metrics, and evaluation results. Of the 20 teams registered for this shared task, 13 teams developed the system and submitted a total of 32 runs. Progress in system performances was obviously, reaching F1 of 36.12 % in position level and 25.27 % in correction level. All data sets with gold standards and scoring scripts are made publicly available to researchers.

pdf bib
A Hybrid System for Chinese Grammatical Error Diagnosis and CorrectionChinese Grammatical Error Diagnosis and Correction
Chen Li | Junpei Zhou | Zuyi Bao | Hengyou Liu | Guangwei Xu | Linlin Li

This paper introduces the DM_NLP team’s system for NLPTEA 2018 shared task of Chinese Grammatical Error Diagnosis (CGED), which can be used to detect and correct grammatical errors in texts written by Chinese as a Foreign Language (CFL) learners. This task aims at not only detecting four types of grammatical errors including redundant words (R), missing words (M), bad word selection (S) and disordered words (W), but also recommending corrections for errors of M and S types. We proposed a hybrid system including four models for this task with two stages : the detection stage and the correction stage. In the detection stage, we first used a BiLSTM-CRF model to tag potential errors by sequence labeling, along with some handcraft features. Then we designed three Grammatical Error Correction (GEC) models to generate corrections, which could help to tune the detection result. In the correction stage, candidates were generated by the three GEC models and then merged to output the final corrections for M and S types. Our system reached the highest precision in the correction subtask, which was the most challenging part of this shared task, and got top 3 on F1 scores for position detection of errors.

pdf bib
Ling@CASS Solution to the NLP-TEA CGED Shared Task 2018CASS Solution to the NLP-TEA CGED Shared Task 2018
Qinan Hu | Yongwei Zhang | Fang Liu | Yueguo Gu

In this study, we employ the sequence to sequence learning to model the task of grammar error correction. The system takes potentially erroneous sentences as inputs, and outputs correct sentences. To breakthrough the bottlenecks of very limited size of manually labeled data, we adopt a semi-supervised approach. Specifically, we adapt correct sentences written by native Chinese speakers to generate pseudo grammatical errors made by learners of Chinese as a second language. We use the pseudo data to pre-train the model, and the CGED data to fine-tune it. Being aware of the significance of precision in a grammar error correction system in real scenarios, we use ensembles to boost precision. When using inputs as simple as Chinese characters, the ensembled system achieves a precision at 86.56 % in the detection of erroneous sentences, and a precision at 51.53 % in the correction of errors of Selection and Missing types.

pdf bib
Chinese Grammatical Error Diagnosis Based on Policy Gradient LSTM ModelChinese Grammatical Error Diagnosis Based on Policy Gradient LSTM Model
Changliang Li | Ji Qi

Chinese Grammatical Error Diagnosis (CGED) is a natural language processing task for the NLPTEA2018 workshop held during ACL2018. The goal of this task is to diagnose Chinese sentences containing four kinds of grammatical errors through the model and find out the sentence errors. Chinese grammatical error diagnosis system is a very important tool, which can help Chinese learners automatically diagnose grammatical errors in many scenarios. However, due to the limitations of the Chinese language’s own characteristics and datasets, the traditional model faces the problem of extreme imbalances in the positive and negative samples and the disappearance of gradients. In this paper, we propose a sequence labeling method based on the Policy Gradient LSTM model and apply it to this task to solve the above problems. The results show that our model can achieve higher precision scores in the case of lower False positive rate (FPR) and it is convenient to optimize the model on-line.

pdf bib
Joint learning of frequency and word embeddings for multilingual readability assessment
Dieu-Thu Le | Cam-Tu Nguyen | Xiaoliang Wang

This paper describes two models that employ word frequency embeddings to deal with the problem of readability assessment in multiple languages. The task is to determine the difficulty level of a given document, i.e., how hard it is for a reader to fully comprehend the text. The proposed models show how frequency information can be integrated to improve the readability assessment. The experimental results testing on both English and Chinese datasets show that the proposed models improve the results notably when comparing to those using only traditional word embeddings.

pdf bib
MULLE : A grammar-based Latin language learning tool to supplement the classroom settingMULLE: A grammar-based Latin language learning tool to supplement the classroom setting
Herbert Lange | Peter Ljunglöf

MULLE is a tool for language learning that focuses on teaching Latin as a foreign language. It is aimed for easy integration into the traditional classroom setting and syllabus, which makes it distinct from other language learning tools that provide standalone learning experience. It uses grammar-based lessons and embraces methods of gamification to improve the learner motivation. The main type of exercise provided by our application is to practice translation, but it is also possible to shift the focus to vocabulary or morphology training.

pdf bib
Textual Features Indicative of Writing Proficiency in Elementary School Spanish DocumentsSpanish Documents
Gemma Bel-Enguix | Diana Dueñas Chávez | Arturo Curiel Díaz

Childhood acquisition of written language is not straightforward. Writing skills evolve differently depending on external factors, such as the conditions in which children practice their productions and the quality of their instructors’ guidance. This can be challenging in low-income areas, where schools may struggle to ensure ideal acquisition conditions. Developing computational tools to support the learning process may counterweight negative environmental influences ; however, few work exists on the use of information technologies to improve childhood literacy. This work centers around the computational study of Spanish word and syllable structure in documents written by 2nd and 3rd year elementary school students. The studied texts were compared against a corpus of short stories aimed at the same age group, so as to observe whether the children tend to produce similar written patterns as the ones they are expected to interpret at their literacy level. The obtained results show some significant differences between the two kinds of texts, pointing towards possible strategies for the implementation of new education software in support of written language acquisition.

pdf bib
A Short Answer Grading System in Chinese by Support Vector ApproachChinese by Support Vector Approach
Shih-Hung Wu | Wen-Feng Shih

In this paper, we report a short answer grading system in Chinese. We build a system based on standard machine learning approaches and test it with translated corpus from two publicly available corpus in English. The experiment results show similar results on two different corpus as in English.

pdf bib
From Fidelity to Fluency : Natural Language Processing for Translator Training
Oi Yee Kwong

This study explores the use of natural language processing techniques to enhance bilingual lexical access beyond simple equivalents, to enable translators to navigate along a wider cross-lingual lexical space and more examples showing different translation strategies, which is essential for them to learn to produce not only faithful but also fluent translations.

pdf bib
Countering Position Bias in Instructor Interventions in MOOC Discussion ForumsMOOC Discussion Forums
Muthu Kumar Chandrasekaran | Min-Yen Kan

We systematically confirm that instructors are strongly influenced by the user interface presentation of Massive Online Open Course (MOOC) discussion forums. In a large scale dataset, we conclusively show that instructor interventions exhibit strong position bias, as measured by the position where the thread appeared on the user interface at the time of intervention. We measure and remove this bias, enabling unbiased statistical modelling and evaluation. We show that our de-biased classifier improves predicting interventions over the state-of-the-art on courses with sufficient number of interventions by 8.2 % in F1 and 24.4 % in recall on average.

pdf bib
Learning to Automatically Generate Fill-In-The-Blank Quizzes
Edison Marrese-Taylor | Ai Nakajima | Yutaka Matsuo | Ono Yuichi

In this paper we formalize the problem automatic fill-in-the-blank question generation using two standard NLP machine learning schemes, proposing concrete deep learning models for each. We present an empirical study based on data obtained from a language learning platform showing that both of our proposed settings offer promising results.

pdf bib
Multilingual Short Text Responses Clustering for Mobile Educational Activities : a Preliminary Exploration
Yuen-Hsien Tseng | Lung-Hao Lee | Yu-Ta Chien | Chun-Yen Chang | Tsung-Yen Li

Text clustering is a powerful technique to detect topics from document corpora, so as to provide information browsing, analysis, and organization. On the other hand, the Instant Response System (IRS) has been widely used in recent years to enhance student engagement in class and thus improve their learning effectiveness. However, the lack of functions to process short text responses from the IRS prevents the further application of IRS in classes. Therefore, this study aims to propose a proper short text clustering module for the IRS, and demonstrate our implemented techniques through real-world examples, so as to provide experiences and insights for further study. In particular, we have compared three clustering methods and the result shows that theoretically better methods need not lead to better results, as there are various factors that may affect the final performance.

pdf bib
Chinese Grammatical Error Diagnosis Based on CRF and LSTM-CRF modelChinese Grammatical Error Diagnosis Based on CRF and LSTM-CRF model
Yujie Zhou | Yinan Shao | Yong Zhou

When learning Chinese as a foreign language, the learners may have some grammatical errors due to negative migration of their native languages. However, few grammar checking applications have been developed to support the learners. The goal of this paper is to develop a tool to automatically diagnose four types of grammatical errors which are redundant words (R), missing words (M), bad word selection (S) and disordered words (W) in Chinese sentences written by those foreign learners. In this paper, a conventional linear CRF model with specific feature engineering and a LSTM-CRF model are used to solve the CGED (Chinese Grammatical Error Diagnosis) task. We make some improvement on both models and the submitted results have better performance on false positive rate and accuracy than the average of all runs from CGED2018 for all three evaluation levels.

pdf bib
Detecting Simultaneously Chinese Grammar Errors Based on a BiLSTM-CRF ModelChinese Grammar Errors Based on a BiLSTM-CRF Model
Yajun Liu | Hongying Zan | Mengjie Zhong | Hongchao Ma

In the process of learning and using Chinese, many learners of Chinese as foreign language(CFL) may have grammar errors due to negative migration of their native languages. This paper introduces our system that can simultaneously diagnose four types of grammatical errors including redundant (R), missing (M), selection (S), disorder (W) in NLPTEA-5 shared task. We proposed a Bidirectional LSTM CRF neural network (BiLSTM-CRF) that combines BiLSTM and CRF without hand-craft features for Chinese Grammatical Error Diagnosis (CGED). Evaluation includes three levels, which are detection level, identification level and position level. At the detection level and identification level, our system got the third recall scores, and achieved good F1 values.

pdf bib
A Hybrid Approach Combining Statistical Knowledge with Conditional Random Fields for Chinese Grammatical Error DetectionChinese Grammatical Error Detection
Yiyi Wang | Chilin Shih

This paper presents a method of combining Conditional Random Fields (CRFs) model with a post-processing layer using Google n-grams statistical information tailored to detect word selection and word order errors made by learners of Chinese as Foreign Language (CFL). We describe the architecture of the model and its performance in the shared task of the ACL 2018 Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA). This hybrid approach yields comparably high false positive rate (FPR = 0.1274) and precision (Pd= 0.7519 ; Pi= 0.6311), but low recall (Rd = 0.3035 ; Ri = 0.1696) in grammatical error detection and identification tasks. Additional statistical information and linguistic rules can be added to enhance the model performance in the future.

pdf bib
CYUT-III Team Chinese Grammatical Error Diagnosis System Report in NLPTEA-2018 CGED Shared TaskCYUT-III Team Chinese Grammatical Error Diagnosis System Report in NLPTEA-2018 CGED Shared Task
Shih-Hung Wu | Jun-Wei Wang | Liang-Pu Chen | Ping-Che Yang

This paper reports how we build a Chinese Grammatical Error Diagnosis system in the NLPTEA-2018 CGED shared task. In 2018, we sent three runs with three different approaches. The first one is a pattern-based approach by frequent error pattern matching. The second one is a sequential labelling approach by conditional random fields (CRF). The third one is a rewriting approach by sequence to sequence (seq2seq) model. The three approaches have different properties that aim to optimize different performance metrics and the formal run results show the differences as we expected.