Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic

Nasser Zalmout, Nizar Habash


Abstract
This paper presents a model for Arabic morphological disambiguation based on Recurrent Neural Networks (RNN). We train Long Short-Term Memory (LSTM) cells in several configurations and embedding levels to model the various morphological features. Our experiments show that these models outperform state-of-the-art systems without explicit use of feature engineering. However, adding learning features from a morphological analyzer to model the space of possible analyses provides additional improvement. We make use of the resulting morphological models for scoring and ranking the analyses of the morphological analyzer for morphological disambiguation. The results show significant gains in accuracy across several evaluation metrics. Our system results in a 4.4% absolute increase over the state of the art in full morphological analysis accuracy (30.6% relative error reduction), and a 10.6% absolute increase (31.5% relative error reduction) for out-of-vocabulary words.
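The abstract describes scoring and ranking an analyzer's candidate analyses with the learned feature models. The following is a minimal sketch of one plausible way to do that, not the authors' implementation: it assumes each feature model yields a probability distribution over values and scores a candidate by its summed log-probability. The function name `rank_analyses`, the feature names `pos` and `gen`, the `floor` smoothing constant, and all probabilities are hypothetical.

```python
from math import log

def rank_analyses(candidates, predictions, floor=1e-6):
    """Sort candidate analyses best-first.

    candidates:  list of dicts mapping feature name -> value, as an
                 analyzer might propose for one word (hypothetical format).
    predictions: dict mapping feature name -> {value: probability}, as
                 per-feature taggers might output (hypothetical format).
    floor:       small probability used for values the tagger did not score,
                 so that log() is always defined (an assumption of this sketch).
    """
    def score(analysis):
        total = 0.0
        for feature, dist in predictions.items():
            value = analysis.get(feature)
            total += log(max(dist.get(value, 0.0), floor))
        return total

    return sorted(candidates, key=score, reverse=True)

# Toy inputs: two features, three candidate analyses for a single word.
predictions = {
    "pos": {"noun": 0.7, "verb": 0.3},
    "gen": {"m": 0.6, "f": 0.4},
}
candidates = [
    {"pos": "verb", "gen": "m"},
    {"pos": "noun", "gen": "f"},
    {"pos": "noun", "gen": "m"},
]
ranked = rank_analyses(candidates, predictions)
best = ranked[0]
```

Because the candidates come from the analyzer, the selected analysis is always one the analyzer considers valid for the word; the feature models only decide which of those candidates ranks first.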
Anthology ID:
D17-1073
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
704–713
URL:
https://aclanthology.org/D17-1073
DOI:
10.18653/v1/D17-1073
Cite (ACL):
Nasser Zalmout and Nizar Habash. 2017. Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 704–713, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Don’t Throw Those Morphological Analyzers Away Just Yet: Neural Morphological Disambiguation for Arabic (Zalmout & Habash, EMNLP 2017)
PDF:
https://aclanthology.org/D17-1073.pdf