The SMarT Classifier for Arabic Fine-Grained Dialect Identification
Karima Meftouh | Karima Abidi | Salima Harrat | Kamel Smaili
Proceedings of the Fourth Arabic Natural Language Processing Workshop

This paper describes the approach adopted by the SMarT research group to build a dialect identification system in the framework of the MADAR shared task on Arabic fine-grained dialect identification. We experimented with several approaches, but finally decided to use a Multinomial Naive Bayes classifier based on word and character n-grams in addition to language model probabilities. We achieved a score of 67.73% in terms of macro accuracy and a macro-averaged F1-score of 67.31%.
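
To make the classifier family named in the abstract concrete, here is a minimal pure-Python sketch of a Multinomial Naive Bayes model over word and character n-gram counts. It is not the authors' system: the n-gram ranges, the smoothing constant `alpha`, the helper names (`ngram_features`, `MultinomialNB`) and the toy training sentences (invented placeholder tokens, not real dialect data) are all assumptions, and the language model probability features mentioned in the abstract are left out.

```python
import math
from collections import Counter, defaultdict


def ngram_features(text, word_n=(1, 2), char_n=(1, 3)):
    """Count word n-grams and character n-grams of one sentence."""
    feats = Counter()
    words = text.split()
    for n in range(word_n[0], word_n[1] + 1):
        for i in range(len(words) - n + 1):
            feats["w:" + " ".join(words[i:i + n])] += 1
    padded = f" {text} "  # pad so word boundaries appear in char n-grams
    for n in range(char_n[0], char_n[1] + 1):
        for i in range(len(padded) - n + 1):
            feats["c:" + padded[i:i + n]] += 1
    return feats


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngram_features(text)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        self.totals = {c: sum(fc.values()) for c, fc in self.feat_counts.items()}
        return self

    def predict(self, text):
        feats = ngram_features(text)
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.class_counts:
            # log prior + sum of count-weighted log likelihoods
            score = math.log(self.class_counts[c] / n_docs)
            denom = self.totals[c] + self.alpha * len(self.vocab)
            for f, count in feats.items():
                if f in self.vocab:  # features never seen in training are skipped
                    score += count * math.log(
                        (self.feat_counts[c][f] + self.alpha) / denom
                    )
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Toy data: invented tokens and hypothetical labels, for illustration only.
train_texts = ["zaf kelo miru", "zaf kelo tandi", "bru sipo miru", "bru sipo lenta"]
train_labels = ["dialect_A", "dialect_A", "dialect_B", "dialect_B"]
model = MultinomialNB().fit(train_texts, train_labels)
```

Calling `model.predict("zaf kelo lenta")` scores each label by its log prior plus the smoothed log likelihood of every observed n-gram and returns the highest-scoring one. Prefixing features with `w:` and `c:` keeps word and character n-grams in a single count vector without collisions, which is one simple way to combine the two views.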


An enhanced automatic speech recognition system for Arabic
Mohamed Amine Menacer | Odile Mella | Dominique Fohr | Denis Jouvet | David Langlois | Kamel Smaili
Proceedings of the Third Arabic Natural Language Processing Workshop

Automatic speech recognition for Arabic is a very challenging task. Although the classical techniques for Automatic Speech Recognition (ASR) can be applied efficiently to Arabic speech recognition, it is essential to take the specificities of the language into account to improve system performance. In this article, we focus on Modern Standard Arabic (MSA) speech recognition. We introduce the challenges related to the Arabic language, namely its complex morphology and the absence of short vowels in written text, which leads to several potential, often conflicting, vowelizations for each grapheme. We develop an ASR system for MSA using the Kaldi toolkit. Several acoustic and language models are trained. We obtain a Word Error Rate (WER) of 14.42% for the baseline system and a 12.2% relative improvement by rescoring the lattice and by rewriting the output with the correct hamza above or below Alif.
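
For readers unfamiliar with the metric, the sketch below shows the standard definition of WER (word-level edit distance divided by reference length) and how a relative improvement translates into an absolute score. The function names are hypothetical, the example strings are invented, and the sketch assumes that both figures in the abstract are percentages; it is not taken from the authors' Kaldi setup.

```python
def word_error_rate(reference, hypothesis):
    """WER in percent: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous DP row
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = cur
    return 100.0 * prev[-1] / len(ref)


def apply_relative_improvement(baseline_wer, relative_gain_pct):
    """WER after a relative reduction of `relative_gain_pct` percent."""
    return baseline_wer * (1.0 - relative_gain_pct / 100.0)


# Invented example: one substitution and one deletion against four reference words.
example_wer = word_error_rate("a b c d", "a x c")

# Assumption: 14.42 and 12.2 in the abstract are both percentages.
improved_wer = apply_relative_improvement(14.42, 12.2)
```

Under that assumption, `improved_wer` comes out at roughly 12.66, which is the absolute WER implied by a 12.2% relative gain over a 14.42% baseline.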