Thierry Hamon


2020

pdf bib
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Atelier DÉfi Fouille de Textes
Rémi Cardon | Natalia Grabar | Cyril Grouin | Thierry Hamon
Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Atelier DÉfi Fouille de Textes

2019

pdf bib
RNN Embeddings for Identifying Difficult to Understand Medical WordsRNN Embeddings for Identifying Difficult to Understand Medical Words
Hanna Pylieva | Artem Chernodub | Natalia Grabar | Thierry Hamon
Proceedings of the 18th BioNLP Workshop and Shared Task

Patients and their families often require a better understanding of medical information provided by doctors. We currently address this issue by improving the identification of difficult to understand medical words. We introduce novel embeddings received from RNN-FrnnMUTE (French RNN Medical Understandability Text Embeddings) which allow to reach up to 87.0 F1 score in identification of difficult words. We also note that adding pre-trained FastText word embeddings to the feature set substantially improves the performance of the model which classifies words according to their difficulty. We study the generalizability of different models through three cross-validation scenarios which allow testing classifiers in real-world conditions : understanding of medical words by new users, and classification of new unseen words by the automatic models. The RNN-FrnnMUTE embeddings and the categorization code are being made available for the research.