Marie Francine Moens

Also published as: Marie-Francine Moens


2021

Imposing Relation Structure in Language-Model Embeddings Using Contrastive Learning
Christos Theodoropoulos | James Henderson | Andrei Catalin Coman | Marie-Francine Moens
Proceedings of the 25th Conference on Computational Natural Language Learning

Though language model text embeddings have revolutionized NLP research, their ability to capture high-level semantic information, such as relations between entities in text, is limited. In this paper, we propose a novel contrastive learning framework that trains sentence embeddings to encode the relations in a graph structure. Given a sentence (unstructured text) and its graph, we use contrastive learning to impose relation-related structure on the token-level representations of the sentence obtained with a CharacterBERT (El Boukkouri et al., 2020) model. The resulting relation-aware sentence embeddings achieve state-of-the-art results on the relation extraction task using only a simple KNN classifier, thereby demonstrating the success of the proposed method. An additional t-SNE visualization shows the effectiveness of the learned representation space compared to baselines. Furthermore, we show that we can learn a different space for named entity recognition, again using a contrastive learning objective, and demonstrate how to successfully combine both representation spaces in an entity-relation task.
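
As a hedged illustration of the general mechanism, a supervised contrastive objective of the following form pulls representations that share a relation type together and pushes others apart; this is a generic sketch, not the paper's exact loss or pooling strategy.

```python
# Generic supervised contrastive loss (a sketch, not the paper's objective):
# embeddings with the same relation label are treated as positives.
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """embeddings: (N, d) pooled span/sentence representations; labels: (N,) relation ids."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                          # pairwise similarities
    mask = labels.unsqueeze(0) == labels.unsqueeze(1)      # same-relation pairs
    mask.fill_diagonal_(False)
    logits = sim - torch.eye(len(z), device=z.device) * 1e9  # exclude self-similarity
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_counts = mask.sum(1).clamp(min=1)
    loss = -(log_prob * mask).sum(1) / pos_counts          # mean log-prob over positives
    return loss[mask.any(1)].mean()

# After training, relation extraction can be done with a simple KNN classifier
# over the learned embeddings (e.g., sklearn.neighbors.KNeighborsClassifier).
```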

Proceedings of the Third Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Marius Mosbach | Michael A. Hedderich | Sandro Pezzelle | Aditya Mogadala | Dietrich Klakow | Marie-Francine Moens | Zeynep Akata
Proceedings of the Third Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Marie-Francine Moens | Xuanjing Huang | Lucia Specia | Scott Wen-tau Yih
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Just Ask! Evaluating Machine Translation by Asking and Answering Questions
Mateusz Krubiński | Erfan Ghadery | Marie-Francine Moens | Pavel Pecina
Proceedings of the Sixth Conference on Machine Translation

In this paper, we show that automatically-generated questions and answers can be used to evaluate the quality of Machine Translation (MT) systems. Building on recent work on the evaluation of abstractive text summarization, we propose a new metric for system-level MT evaluation, compare it with other state-of-the-art solutions, and show its robustness by conducting experiments for various MT directions.
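
A rough sketch of the idea, under simplifying assumptions: questions and gold answers are generated from the reference (that step is omitted here), a question-answering model answers them from the candidate translation, and the answer overlap serves as the score. The QA model name and the token-level F1 scoring are illustrative choices, not the paper's exact configuration.

```python
# Answer reference-derived questions against the candidate translation and
# score the overlap between predicted and gold answers.
from collections import Counter
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

def token_f1(pred, gold):
    p, g = pred.lower().split(), gold.lower().split()
    common = sum((Counter(p) & Counter(g)).values())
    if common == 0:
        return 0.0
    prec, rec = common / len(p), common / len(g)
    return 2 * prec * rec / (prec + rec)

def qa_mt_score(hypothesis, qa_pairs):
    """qa_pairs: [(question, gold_answer), ...] generated from the reference."""
    scores = [token_f1(qa(question=q, context=hypothesis)["answer"], a)
              for q, a in qa_pairs]
    return sum(scores) / len(scores) if scores else 0.0
```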

MTEQA at WMT21 Metrics Shared Task
Mateusz Krubiński | Erfan Ghadery | Marie-Francine Moens | Pavel Pecina
Proceedings of the Sixth Conference on Machine Translation

In this paper, we describe our submission to the WMT 2021 Metrics Shared Task. We use automatically-generated questions and answers to evaluate the quality of Machine Translation (MT) systems. Our submission builds upon the recently proposed MTEQA framework. Experiments on WMT20 evaluation datasets show that at the system level the MTEQA metric achieves performance comparable with other state-of-the-art solutions, while considering only part of the information contained in the whole translation.

2020

LIIR at SemEval-2020 Task 12: A Cross-Lingual Augmentation Approach for Multilingual Offensive Language Identification
Erfan Ghadery | Marie-Francine Moens
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper presents our system, entitled ‘LIIR’, for SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2). We participated in sub-task A for the English, Danish, Greek, Arabic, and Turkish languages. We adapt and fine-tune the BERT and multilingual BERT (mBERT) models made available by Google AI for English and the non-English languages, respectively. For the English language, we use a combination of two fine-tuned BERT models. For the other languages, we propose a cross-lingual augmentation approach to enrich the training data, and we use multilingual BERT to obtain sentence representations.
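
As an illustration of the augmentation step (not the system's actual pipeline), the sketch below machine-translates labelled English examples into a target language, Danish in this hypothetical example, and adds them to the target-language training data before fine-tuning multilingual BERT. The translation model and function names are assumptions.

```python
# Cross-lingual augmentation sketch: enrich target-language training data with
# machine-translated English examples, keeping the original labels.
from transformers import pipeline

translate_en_da = pipeline("translation", model="Helsinki-NLP/opus-mt-en-da")

def augment(danish_data, english_data):
    """Both arguments are lists of (text, label) pairs."""
    augmented = list(danish_data)
    for text, label in english_data:
        translated = translate_en_da(text, max_length=256)[0]["translation_text"]
        augmented.append((translated, label))
    return augmented  # fine-tune bert-base-multilingual-cased on this set
```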

ECHR: Legal Corpus for Argument Mining
Prakash Poudyal | Jaromir Savelka | Aagje Ieven | Marie Francine Moens | Teresa Goncalves | Paulo Quaresma
Proceedings of the 7th Workshop on Argument Mining

In this paper, we publicly release an annotated corpus of 42 decisions of the European Court of Human Rights (ECHR). The corpus is annotated in terms of three types of clauses useful in argument mining: premise, conclusion, and non-argument parts of the text. Furthermore, relationships among the premises and conclusions are mapped. We present baselines for three tasks that lead from unstructured texts to structured arguments. The tasks are argument clause recognition, clause relation prediction, and premise/conclusion recognition. Despite a straightforward application of Bidirectional Encoder Representations from Transformers (BERT), we obtained very promising results (F1 of 0.765 on argument recognition, 0.511 on relation prediction, and 0.859/0.628 on premise/conclusion recognition). The results suggest the usefulness of pre-trained language models based on deep neural network architectures in argument mining. Because of the simplicity of the baselines, there is ample space for improvement in future work based on the released corpus.
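
A minimal sketch of the kind of BERT baseline described above, assuming clause texts are classified independently into premise, conclusion, or non-argument; the model name, label ordering, and the omitted fine-tuning loop are illustrative assumptions.

```python
# Clause classification baseline sketch: a BERT sequence classifier over clauses.
# The classification head below is freshly initialized and still needs fine-tuning.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["premise", "conclusion", "non-argument"]
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

def classify_clause(text):
    inputs = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return LABELS[int(logits.argmax(-1))]
```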

Decoding Language Spatial Relations to 2D Spatial Arrangements
Gorjan Radevski | Guillem Collell | Marie-Francine Moens | Tinne Tuytelaars
Findings of the Association for Computational Linguistics: EMNLP 2020

We address the problem of multimodal spatial understanding by decoding a set of language-expressed spatial relations to a set of 2D spatial arrangements in a multi-object and multi-relationship setting. We frame the task as arranging a scene of clip-arts given a textual description. We propose a simple and effective model architecture, Spatial-Reasoning Bert (SR-Bert), trained to decode text to 2D spatial arrangements in a non-autoregressive manner. SR-Bert can decode both explicit and implicit language to 2D spatial arrangements, generalizes to out-of-sample data to a reasonable extent, and can generate complete abstract scenes if paired with a clip-arts predictor. Finally, we qualitatively evaluate our method with a user study, validating that our generated spatial arrangements align with human expectations.

Autoregressive Reasoning over Chains of Facts with Transformers
Ruben Cartuyvels | Graham Spinks | Marie-Francine Moens
Proceedings of the 28th International Conference on Computational Linguistics

This paper proposes an iterative inference algorithm for multi-hop explanation regeneration that retrieves relevant factual evidence in the form of text snippets, given a natural language question and its answer. Combining multiple sources of evidence or facts for multi-hop reasoning becomes increasingly hard as the number of sources needed to make an inference grows. Our algorithm copes with this by decomposing the selection of facts from a corpus autoregressively, conditioning the next iteration on previously selected facts. This allows us to use a pairwise learning-to-rank loss. We validate our method on the datasets of the TextGraphs 2019 and 2020 Shared Tasks for explanation regeneration. Existing work on this task either evaluates facts in isolation or artificially limits the possible chains of facts, thus limiting multi-hop inference. We demonstrate that our algorithm, when used with a pre-trained transformer model, outperforms the previous state of the art in terms of precision, training time and inference efficiency.
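
A hedged sketch of how such a pairwise learning-to-rank loss could look, assuming a relevance scorer (the hypothetical score_fn below, standing in for a transformer-based model) that scores a candidate fact conditioned on the question and the facts selected so far; this is an illustration, not the paper's exact loss.

```python
# Pairwise ranking loss: at each autoregressive step, the gold next fact should
# score higher than sampled negative facts by at least a margin.
import torch

def pairwise_ranking_loss(score_fn, question, selected, gold_fact, negatives, margin=1.0):
    pos = score_fn(question, selected, gold_fact)                 # scalar tensor
    negs = torch.stack([score_fn(question, selected, n) for n in negatives])
    return torch.clamp(margin - pos + negs, min=0).mean()
```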

Proceedings of the Third International Workshop on Spatial Language Understanding
Parisa Kordjamshidi | Archna Bhatia | Malihe Alikhani | Jason Baldridge | Mohit Bansal | Marie-Francine Moens
Proceedings of the Third International Workshop on Spatial Language Understanding

Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Aditya Mogadala | Sandro Pezzelle | Dietrich Klakow | Marie-Francine Moens | Zeynep Akata
Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

2019

Talk2Car: Taking Control of Your Self-Driving Car
Thierry Deruyttere | Simon Vandenhende | Dusan Grujicic | Luc Van Gool | Marie-Francine Moens
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

A long-term goal of artificial intelligence is to have an agent execute commands communicated through natural language. In many cases the commands are grounded in a visual environment shared by the human who gives the command and the agent. Execution of the command then requires mapping the command into the physical visual space, after which the appropriate action can be taken. In this paper, we consider the former. More specifically, we consider the problem in an autonomous driving setting, where a passenger requests an action that can be associated with an object found in a street scene. Our work presents the Talk2Car dataset, which is the first object referral dataset that contains commands written in natural language for self-driving cars. We provide a detailed comparison with related datasets such as ReferIt, RefCOCO, RefCOCO+, RefCOCOg, Cityscape-Ref and CLEVR-Ref. Additionally, we include a performance analysis using strong state-of-the-art models. The results show that the proposed object referral task is a challenging one for which the models show promising results but still require additional research in natural language processing, computer vision and the intersection of these fields. The dataset can be found on our website: http://macchina-ai.eu/

Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Aditya Mogadala | Dietrich Klakow | Sandro Pezzelle | Marie-Francine Moens
Proceedings of the Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)

Learning Unsupervised Multilingual Word Embeddings with Incremental Multilingual Hubs
Geert Heyman | Bregt Verreet | Ivan Vulić | Marie-Francine Moens
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Recent research has discovered that a shared bilingual word embedding space can be induced by projecting monolingual word embedding spaces from two languages using a self-learning paradigm without any bilingual supervision. However, it has also been shown that for distant language pairs such fully unsupervised self-learning methods are unstable and often get stuck in poor local optima due to reduced isomorphism between starting monolingual spaces. In this work, we propose a new robust framework for learning unsupervised multilingual word embeddings that mitigates the instability issues. We learn a shared multilingual embedding space for a variable number of languages by incrementally adding new languages one by one to the current multilingual space. Through this gradual language addition, the method can leverage the interdependencies between the new language and all other languages in the current multilingual space. We find that it is beneficial to project more distant languages later in the iterative process. Our fully unsupervised multilingual embedding spaces yield results that are on par with the state-of-the-art methods in the bilingual lexicon induction (BLI) task, and simultaneously obtain state-of-the-art scores on two downstream tasks: multilingual document classification and multilingual dependency parsing, outperforming even supervised baselines. This finding also accentuates the need to establish evaluation protocols for cross-lingual word embeddings beyond the omnipresent intrinsic BLI task in future work.
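
As a rough illustration of one ingredient of such an approach, the sketch below maps a new language into an existing shared "hub" space with an orthogonal Procrustes projection learned from a seed or induced dictionary; the paper's full unsupervised self-learning and language-ordering procedure is not reproduced here.

```python
# Orthogonal Procrustes projection of a new language's embeddings into the hub space.
import numpy as np

def procrustes(src_vecs, hub_vecs):
    """Both arrays are (n_pairs, d); row i of src is aligned with row i of hub."""
    u, _, vt = np.linalg.svd(src_vecs.T @ hub_vecs)
    return u @ vt                        # orthogonal map W: src -> hub space

def add_language(hub_emb, new_emb, dictionary):
    """dictionary: list of (new_word_idx, hub_word_idx) translation pairs."""
    src = np.stack([new_emb[i] for i, _ in dictionary])
    tgt = np.stack([hub_emb[j] for _, j in dictionary])
    W = procrustes(src, tgt)
    return new_emb @ W                   # the new language projected into the hub
```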

2018

Word-Level Loss Extensions for Neural Temporal Relation Classification
Artuur Leeuwenberg | Marie-Francine Moens
Proceedings of the 27th International Conference on Computational Linguistics

Unsupervised pre-trained word embeddings are used effectively for many tasks in natural language processing to leverage unlabeled textual data. Often these embeddings are either used as initializations or as fixed word representations for task-specific classification models. In this work, we extend our classification model’s task loss with an unsupervised auxiliary loss on the word-embedding level of the model. This is to ensure that the learned word representations contain both task-specific features, learned from the supervised loss component, and more general features learned from the unsupervised loss component. We evaluate our approach on the task of temporal relation extraction, in particular, narrative containment relation extraction from clinical records, and show that continued training of the embeddings on the unsupervised objective together with the task objective gives better task-specific embeddings, and results in an improvement over the state of the art on the THYME dataset, using only a general-domain part-of-speech tagger as linguistic resource.
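
For illustration, a negative-sampling skip-gram loss computed on the model's own embedding matrix is one possible unsupervised auxiliary loss of this kind; the exact auxiliary objective and its weighting below are assumptions, not necessarily those of the paper.

```python
# Unsupervised auxiliary loss on the embedding layer, to be added to the task loss.
import torch
import torch.nn.functional as F

def skipgram_ns_loss(emb, centers, contexts, negatives):
    """Negative-sampling skip-gram loss on the task model's own embedding matrix.
    emb: nn.Embedding shared with the task model; centers/contexts: (B,) word ids;
    negatives: (B, K) word ids sampled from the unigram distribution."""
    c = emb(centers)                                              # (B, d)
    pos = (c * emb(contexts)).sum(-1)                             # (B,)
    neg = torch.bmm(emb(negatives), c.unsqueeze(-1)).squeeze(-1)  # (B, K)
    return -(F.logsigmoid(pos).mean() + F.logsigmoid(-neg).mean())

# total_loss = task_loss + aux_weight * skipgram_ns_loss(emb, centers, contexts, negatives)
```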

Learning Representations Specialized in Spatial Knowledge: Leveraging Language and Vision
Guillem Collell | Marie-Francine Moens
Transactions of the Association for Computational Linguistics, Volume 6

Spatial understanding is crucial in many real-world problems, yet little progress has been made towards building representations that capture spatial knowledge. Here, we move one step forward in this direction and learn such representations by leveraging a task consisting of predicting continuous 2D spatial arrangements of objects given object-relationship-object instances (e.g., cat under chair) and a simple neural network model that learns the task from annotated images. We show that the model succeeds in this task and, furthermore, that it is capable of predicting correct spatial arrangements for unseen objects if either CNN features or word embeddings of the objects are provided. The differences between visual and linguistic features are discussed. Next, to evaluate the spatial representations learned in the previous task, we introduce a task and a dataset consisting of a set of crowdsourced human ratings of spatial similarity for object pairs. We find that both CNN (convolutional neural network) features and word embeddings predict human judgments of similarity well and that these vectors can be further specialized in spatial knowledge if we update them when training the model that predicts spatial arrangements of objects. Overall, this paper paves the way towards building distributed spatial representations, contributing to the understanding of spatial expressions in language.
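
As a hypothetical illustration of the prediction task, a small feed-forward network could map the concatenated (object, relation, object) embeddings to a continuous 2D offset; the architecture and dimensions below are assumptions rather than the paper's model.

```python
# Feed-forward sketch: predict a continuous 2D spatial offset from
# (object, relationship, object) embeddings.
import torch
import torch.nn as nn

class SpatialArrangementNet(nn.Module):
    def __init__(self, emb_dim=300, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2))        # continuous (x, y) offset

    def forward(self, subj_emb, rel_emb, obj_emb):
        return self.net(torch.cat([subj_emb, rel_emb, obj_emb], dim=-1))
```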

Proceedings of the First International Workshop on Spatial Language Understanding
Parisa Kordjamshidi | Archna Bhatia | James Pustejovsky | Marie-Francine Moens
Proceedings of the First International Workshop on Spatial Language Understanding

Evaluating Textual Representations through Image Generation
Graham Spinks | Marie-Francine Moens
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

We present a methodology for determining the quality of textual representations through the ability to generate images from them. Continuous representations of textual input are ubiquitous in modern Natural Language Processing techniques, either at the core of machine learning algorithms or as the by-product at any given layer of a neural network. While current techniques to evaluate such representations focus on their performance on particular tasks, they don't provide a clear understanding of the level of informational detail that is stored within them, especially their ability to represent spatial information. The central premise of this paper is that visual inspection or analysis is the most convenient method to quickly and accurately determine information content. Through the use of text-to-image neural networks, we propose a new technique to compare the quality of textual representations by visualizing their information content. The method is illustrated on a medical dataset where the correct representation of spatial information and shorthands are of particular importance. For four different well-known textual representations, we show with a quantitative analysis that some representations are consistently able to deliver higher quality visualizations of the information content. Additionally, we show that the quantitative analysis technique correlates with the judgment of a human expert evaluator in terms of alignment.

Generating Continuous Representations of Medical Texts
Graham Spinks | Marie-Francine Moens
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

We present an architecture that generates medical texts while learning an informative, continuous representation with discriminative features. During training the input to the system is a dataset of captions for medical X-Rays. The acquired continuous representations are of particular interest for use in many machine learning techniques where the discrete and high-dimensional nature of textual input is an obstacle. We use an Adversarially Regularized Autoencoder to create realistic text in both an unconditional and conditional setting. We show that this technique is applicable to medical texts which often contain syntactic and domain-specific shorthands. A quantitative evaluation shows that we achieve a lower model perplexity than a traditional LSTM generator.

2017

Improving Implicit Semantic Role Labeling by Predicting Semantic Frame Arguments
Quynh Ngoc Thi Do | Steven Bethard | Marie-Francine Moens
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Implicit semantic role labeling (iSRL) is the task of predicting the semantic roles of a predicate that do not appear as explicit arguments, but rather regard common sense knowledge or are mentioned earlier in the discourse. We introduce an approach to iSRL based on a predictive recurrent neural semantic frame model (PRNSFM) that uses a large unannotated corpus to learn the probability of a sequence of semantic arguments given a predicate. We leverage the sequence probabilities predicted by the PRNSFM to estimate selectional preferences for predicates and their arguments. On the NomBank iSRL test set, our approach improves state-of-the-art performance on implicit semantic role labeling with less reliance than prior work on manually constructed language resources.

Learning to Recognize Animals by Watching Documentaries: Using Subtitles as Weak Supervision
Aparna Nurani Venkitasubramanian | Tinne Tuytelaars | Marie-Francine Moens
Proceedings of the Sixth Workshop on Vision and Language

We investigate animal recognition models learned from wildlife video documentaries by using the weak supervision of the textual subtitles. This is a particularly challenging setting, since i) the animals occur in their natural habitat and are often largely occluded and ii) subtitles are to a large degree complementary to the visual content, providing a very weak supervisory signal. This is in contrast to most work on integrated vision and language in the literature, where textual descriptions are tightly linked to the image content, and often generated in a curated fashion for the task at hand. In particular, we investigate different image representations and models, including a support vector machine on top of activations of a pretrained convolutional neural network, as well as a Naive Bayes framework on a ‘bag-of-activations’ image representation, where each element of the bag is considered separately. This representation allows key components in the image to be isolated, in spite of largely varying backgrounds and image clutter, without an object detection or image segmentation step. The methods are evaluated based on how well they transfer to unseen camera-trap images captured across diverse topographical regions under different environmental conditions and illumination settings, involving a large domain shift.
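
A minimal sketch of the first kind of model mentioned above, assuming frame-level CNN activations and subtitle-derived weak labels are already available; the scikit-learn pipeline below is illustrative, not the paper's exact setup.

```python
# Linear SVM over pretrained-CNN activations, trained with weak subtitle-derived labels.
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

def train_animal_classifier(cnn_features, weak_labels):
    """cnn_features: (n_frames, d) activations; weak_labels: animal names per frame."""
    clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
    clf.fit(cnn_features, weak_labels)
    return clf  # evaluate by transfer to unseen camera-trap images
```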

Structured Learning for Temporal Relation Extraction from Clinical Records
Artuur Leeuwenberg | Marie-Francine Moens
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

We propose a scalable structured learning model that jointly predicts temporal relations between events and temporal expressions (TLINKS), and the relation between these events and the document creation time (DCTR). We employ a structured perceptron, together with integer linear programming constraints for document-level inference during training and prediction, to exploit relational properties of temporality and to learn the relations globally at the document level. Moreover, this study gives insights into the results of integrating constraints for temporal relation extraction when using structured learning and prediction. Our best system outperforms the state-of-the-art on both the CONTAINS TLINK task and the DCTR task.
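
As an illustration of document-level inference with integer linear programming constraints, the sketch below picks one label per event pair so as to maximize local scores while enforcing transitivity of CONTAINS; the PuLP solver, the label set, and the single constraint family are assumptions for illustration, not the paper's toolkit or full constraint set.

```python
# ILP decoding sketch: choose one temporal label per event pair subject to
# transitivity of the CONTAINS relation.
import pulp

def decode(pairs, scores, labels=("CONTAINS", "NO_RELATION")):
    """pairs: list of (i, j) event-pair ids; scores[(i, j)][label]: local model score."""
    prob = pulp.LpProblem("temporal_decoding", pulp.LpMaximize)
    x = {(p, l): pulp.LpVariable(f"x_{p[0]}_{p[1]}_{l}", cat="Binary")
         for p in pairs for l in labels}
    prob += pulp.lpSum(scores[p][l] * x[p, l] for p in pairs for l in labels)
    for p in pairs:                                   # exactly one label per pair
        prob += pulp.lpSum(x[p, l] for l in labels) == 1
    pair_set = set(pairs)
    for (a, b) in pairs:                              # CONTAINS(a,b) & CONTAINS(b,c) -> CONTAINS(a,c)
        for (b2, c) in pairs:
            if b2 == b and a != c and (a, c) in pair_set:
                prob += (x[(a, b), "CONTAINS"] + x[(b, c), "CONTAINS"]
                         - x[(a, c), "CONTAINS"]) <= 1
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {p: next(l for l in labels if x[p, l].value() > 0.5) for p in pairs}
```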