Lexical and Computational Semantics and Semantic Evaluation (formerly Workshop on Sense Evaluation) (2018)


Proceedings of The 12th International Workshop on Semantic Evaluation

Marianna Apidianaki | Saif M. Mohammad | Jonathan May | Ekaterina Shutova | Steven Bethard | Marine Carpuat

SemEval-2018 Task 1: Affect in Tweets
Saif Mohammad | Felipe Bravo-Marquez | Mohammad Salameh | Svetlana Kiritchenko

We present the SemEval-2018 Task 1: Affect in Tweets, which includes an array of subtasks on inferring the affectual state of a person from their tweet. For each task, we created labeled data from English, Arabic, and Spanish tweets. The individual tasks are: 1. emotion intensity regression, 2. emotion intensity ordinal classification, 3. valence (sentiment) regression, 4. valence ordinal classification, and 5. emotion classification. Seventy-five teams (about 200 team members) participated in the shared task. We summarize the methods, resources, and tools used by the participating teams, with a focus on the techniques and resources that are particularly useful. We also analyze systems for consistent bias towards a particular race or gender. The data is made freely available to further improve our understanding of how people convey emotions through language.

SeerNet at SemEval-2018 Task 1: Domain Adaptation for Affect in Tweets
Venkatesh Duppada | Royal Jain | Sushant Hiray

The paper describes the best performing system for the SemEval-2018 Affect in Tweets (English) sub-tasks. The system focuses on the ordinal classification and regression sub-tasks for valence and emotion. For ordinal classification, valence is classified into 7 different classes ranging from -3 to 3, whereas emotion is classified into 4 different classes, 0 to 3, separately for each emotion, namely anger, fear, joy and sadness. The regression sub-tasks estimate the intensity of valence and each emotion. The system performs domain adaptation of 4 different models and creates an ensemble to give the final prediction. The proposed system achieved 1st position out of 75 teams which participated in the aforementioned sub-tasks. We outperform the baseline model by margins ranging from 49.2% to 76.4%, thus pushing the state-of-the-art significantly.

SemEval 2018 Task 2: Multilingual Emoji Prediction
Francesco Barbieri | Jose Camacho-Collados | Francesco Ronzano | Luis Espinosa-Anke | Miguel Ballesteros | Valerio Basile | Viviana Patti | Horacio Saggion

This paper describes the results of the first Shared Task on Multilingual Emoji Prediction, organized as part of SemEval 2018. Given the text of a tweet, the task consists of predicting the most likely emoji to be used along with that tweet. Two subtasks were proposed, one for English and one for Spanish, and participants were allowed to submit a system run to one or both subtasks. In total, 49 teams participated in the English subtask and 22 teams submitted a system run to the Spanish subtask. Evaluation was carried out emoji-wise, and the final ranking was based on macro F-Score. Data and further information about this task can be found at https://competitions.codalab.org/competitions/17344.
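
The macro F-Score used for the ranking averages the per-emoji F1 scores without weighting by frequency, so rare emojis count as much as common ones. A minimal sketch in plain Python follows; the function name `macro_f1` and the toy labels are illustrative assumptions, and the official scorer may differ in details such as which labels enter the average.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-label F1 over the labels that occur in gold."""
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            true_pos[g] += 1
        else:
            false_pos[p] += 1  # p was predicted but wrong
            false_neg[g] += 1  # g was missed
    scores = []
    for label in sorted(set(gold)):
        denom = 2 * true_pos[label] + false_pos[label] + false_neg[label]
        scores.append(2 * true_pos[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical emoji labels, written as words for readability.
gold = ["heart", "heart", "heart", "joy", "fire"]
pred = ["heart", "heart", "joy", "joy", "heart"]
score = macro_f1(gold, pred)
```

In this toy example the rare label "fire" is never predicted correctly, so it contributes an F1 of zero and pulls the macro average well below plain accuracy.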

Tübingen-Oslo at SemEval-2018 Task 2: SVMs perform better than RNNs in Emoji Prediction
Çağrı Çöltekin | Taraka Rama

This paper describes our participation in the SemEval-2018 task Multilingual Emoji Prediction. We participated in both English and Spanish subtasks, experimenting with support vector machines (SVMs) and recurrent neural networks. Our SVM classifier obtained the top rank in both subtasks with macro-averaged F1-measures of 35.99% for English and 22.36% for Spanish data sets. Similar to a few earlier attempts, the results with neural networks were not on par with linear SVMs.

SemEval-2018 Task 3: Irony Detection in English Tweets
Cynthia Van Hee | Els Lefever | Véronique Hoste

This paper presents the first shared task on irony detection: given a tweet, automatic natural language processing systems should determine whether the tweet is ironic (Task A) and which type of irony (if any) is expressed (Task B). The ironic tweets were collected using irony-related hashtags (i.e. #irony, #sarcasm, #not) and were subsequently manually annotated to minimise the amount of noise in the corpus. Prior to distributing the data, hashtags that were used to collect the tweets were removed from the corpus. For both tasks, a training corpus of 3,834 tweets was provided, as well as a test set containing 784 tweets. Our shared tasks received submissions from 43 teams for the binary classification Task A and from 31 teams for the multiclass Task B. The highest classification scores obtained for the two subtasks are F1 = 0.71 and F1 = 0.51, respectively, and demonstrate that fine-grained irony classification is much more challenging than binary irony detection.

THU_NGN at SemEval-2018 Task 3: Tweet Irony Detection with Densely connected LSTM and Multi-task Learning
Chuhan Wu | Fangzhao Wu | Sixing Wu | Junxin Liu | Zhigang Yuan | Yongfeng Huang

Detecting irony is an important task to mine fine-grained information from social web messages. Therefore, the SemEval-2018 Task 3 aims to detect ironic tweets (subtask A) and their ironic types (subtask B). To address this task, we propose a system based on a densely connected LSTM network with a multi-task learning strategy. In our dense LSTM model, each layer takes all outputs from previous layers as input. The last LSTM layer outputs the hidden representations of texts, which are used in three classification tasks. In addition, we incorporate several types of features to improve the model performance. Our model achieved an F-score of 70.54 (ranked 2/43) in subtask A and 49.47 (ranked 3/29) in subtask B. The experimental results validate the effectiveness of our system.

SemEval 2018 Task 4: Character Identification on Multiparty Dialogues
Jinho D. Choi | Henry Y. Chen

Character identification is a task of entity linking that finds the global entity of each personal mention in multiparty dialogue. For this task, the first two seasons of the popular TV show Friends are annotated, comprising a total of 448 dialogues, 15,709 mentions, and 401 entities. The personal mentions are detected from nominals referring to certain characters in the show, and the entities are collected from the list of all characters in those two seasons of the show. This task is challenging because it requires the identification of characters that are mentioned but may not be active during the conversation. Among 90+ participants, four submitted their system outputs and showed strengths in different aspects of the task. Thorough analyses of the distributed datasets, system outputs, and comparative studies are also provided. To maintain the momentum, we create an open-source project for this task and publicly release a larger and cleaner dataset, hoping to support researchers in more enhanced modeling.

AMORE-UPF at SemEval-2018 Task 4: BiLSTM with Entity Library
Laura Aina | Carina Silberer | Ionut-Teodor Sorodoc | Matthijs Westera | Gemma Boleda

This paper describes our winning contribution to SemEval 2018 Task 4: Character Identification on Multiparty Dialogues. It is a simple, standard model with one key innovation, an entity library. Our results show that this innovation greatly facilitates the identification of infrequent characters. Because of the generic nature of our model, this finding is potentially relevant to any task that requires effective learning from sparse or imbalanced data.

KOI at SemEval-2018 Task 5: Building Knowledge Graph of Incidents
Paramita Mirza | Fariz Darari | Rahmad Mahendra

We present KOI (Knowledge of Incidents), a system that, given news articles as input, builds a knowledge graph (KOI-KG) of incidental events. KOI-KG can then be used to efficiently answer questions such as "How many killing incidents happened in 2017 that involve Sean?" The required steps in building the KG include: (i) document preprocessing involving word sense disambiguation, named-entity recognition, temporal expression recognition and normalization, and semantic role labeling; (ii) incidental event extraction and coreference resolution via document clustering; and (iii) KG construction and population.

NEUROSENT-PDI at SemEval-2018 Task 1: Leveraging a Multi-Domain Sentiment Model for Inferring Polarity in Micro-blog Text
Mauro Dragoni

This paper describes the NeuroSent system that participated in SemEval 2018 Task 1. Our system takes a supervised approach that builds on neural networks and word embeddings. Word embeddings were built starting from a repository of user-generated reviews; thus, they are specific to sentiment analysis tasks. Tweets are then converted into the corresponding vector representation and given as input to the neural network, with the aim of learning the different semantics contained in each emotion considered by the SemEval task. The output layer has been adapted based on the characteristics of each subtask. Preliminary results obtained on the provided training set are encouraging for pursuing the investigation in this direction.

FOI DSS at SemEval-2018 Task 1: Combining LSTM States, Embeddings, and Lexical Features for Affect Analysis
Maja Karasalo | Mattias Nilsson | Magnus Rosell | Ulrika Wickenberg Bolin

This paper describes the system used and results obtained for team FOI DSS at SemEval-2018 Task 1: Affect In Tweets. The team participated in all English language subtasks, with a method utilizing transfer learning from LSTM nets trained on large sentiment datasets combined with embeddings and lexical features. For four out of five subtasks, the system performed in the range of 92-95% of the winning systems, in terms of the competition metrics. Analysis of the results suggests that improved pre-processing and addition of more lexical features may further elevate performance.

NLPZZX at SemEval-2018 Task 1: Using Ensemble Method for Emotion and Sentiment Intensity Determination
Zhengxin Zhang | Qimin Zhou | Hao Wu

In this paper, we put forward a system that competed at SemEval-2018 Task 1: Affect in Tweets. Our system uses a simple yet effective ensemble method which combines several neural network components. We participate in two subtasks for English tweets: EI-reg and V-reg. For the two subtasks, different combinations of neural components are examined. For EI-reg, our system achieves a Pearson correlation coefficient of 0.727 (all instances) and 0.555 (0.5-1). For V-reg, the corresponding scores are 0.835 and 0.670.
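
For readers unfamiliar with the two reported figures, the sketch below computes a Pearson correlation over all instances and over a restricted range. It assumes that the "(0.5-1)" variant keeps only instances whose gold score is at least 0.5; the function names and the `threshold` parameter are illustrative and are not taken from the task's scoring script.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pearson_upper_range(gold, predicted, threshold=0.5):
    """Pearson correlation restricted to pairs with gold >= threshold."""
    pairs = [(g, p) for g, p in zip(gold, predicted) if g >= threshold]
    return pearson([g for g, _ in pairs], [p for _, p in pairs])
```

The restricted score is typically lower than the all-instances score, as in the figures above, because the narrower gold range leaves less variance for the predictions to track.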

LT3 at SemEval-2018 Task 1: A classifier chain to detect emotions in tweets
Luna De Bruyne | Orphée De Clercq | Véronique Hoste

This paper presents an emotion classification system for English tweets, submitted for the SemEval shared task on Affect in Tweets, subtask 5: Detecting Emotions. The system combines lexicon, n-gram, style, syntactic and semantic features. For this multi-class multi-label problem, we created a classifier chain. This is an ensemble of eleven binary classifiers, one for each possible emotion category, where each model gets the predictions of the preceding models as additional features. The predicted labels are combined to get a multi-label representation of the predictions. Our system was ranked eleventh among thirty-five participating teams, with a Jaccard accuracy of 52.0% and macro- and micro-average F1-scores of 49.3% and 64.0%, respectively.
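
The chaining idea in this abstract can be illustrated with a small sketch. It is not the LT3 system: the nearest-centroid base learner, the function names, and the two-label toy data are all assumptions chosen to keep the example dependency-free. Training here feeds each model the gold labels of the preceding categories, which is one common way to train a chain; at prediction time the chain feeds forward its own predictions.

```python
def train_centroid(rows, targets):
    """Stand-in binary learner: one mean feature vector per class (0 or 1)."""
    dims = len(rows[0])
    centroids = {}
    for label in (0, 1):
        members = [r for r, t in zip(rows, targets) if t == label]
        if members:
            centroids[label] = [
                sum(m[d] for m in members) / len(members) for d in range(dims)
            ]
    return centroids


def predict_centroid(centroids, row):
    """Return the class whose centroid is closest in squared distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def train_chain(rows, label_rows):
    """Model k sees the features plus the labels of categories 0..k-1."""
    models = []
    for k in range(len(label_rows[0])):
        extended = [list(r) + list(y[:k]) for r, y in zip(rows, label_rows)]
        models.append(train_centroid(extended, [y[k] for y in label_rows]))
    return models


def predict_chain(models, row):
    """Each model receives the predictions of the models before it."""
    predictions = []
    for model in models:
        predictions.append(predict_centroid(model, list(row) + predictions))
    return predictions


# Toy data: two features, two hypothetical emotion labels per tweet.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [[1, 1], [1, 1], [0, 0], [0, 0]]
chain = train_chain(features, labels)
```

Because later models see earlier decisions, a chain can exploit correlations between categories that independent per-label classifiers would ignore.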

SINAI at SemEval-2018 Task 1: Emotion Recognition in Tweets
Flor Miriam Plaza-del-Arco | Salud María Jiménez-Zafra | Maite Martin | L. Alfonso Ureña-López

Emotion classification is a new task that combines several disciplines including Artificial Intelligence and Psychology, although Natural Language Processing is perhaps the most challenging area. In this paper, we describe our participation in SemEval-2018 Task 1: Affect in Tweets. In particular, we have participated in the EI-oc, EI-reg and E-c subtasks for English and Spanish.

INGEOTEC at SemEval-2018 Task 1: EvoMSA and μTC for Sentiment Analysis
Mario Graff | Sabino Miranda-Jiménez | Eric S. Tellez | Daniela Moctezuma

This paper describes our participation in the Affective Tweets task, in the emotional intensity and sentiment intensity subtasks for English, Spanish, and Arabic. We used two approaches, μTC and EvoMSA. The first is a generic text categorization and regression system; the second is a two-stage architecture for sentiment analysis. Both approaches are multilingual and domain independent.

Tw-StAR at SemEval-2018 Task 1: Preprocessing Impact on Multi-label Emotion Classification
Hala Mulki | Chedi Bechikh Ali | Hatem Haddad | Ismail Babaoğlu

In this paper, we describe our contribution to the SemEval-2018 contest. We tackled Task 1, Affect in Tweets, subtask E-c, Detecting Emotions (multi-label classification). A multi-label classification system, Tw-StAR, was developed to recognize the emotions embedded in Arabic, English and Spanish tweets. To handle the multi-label classification problem via traditional classifiers, we employed the binary relevance transformation strategy, while a TF-IDF scheme was used to generate the tweets’ features. We investigated using single preprocessing tasks and combinations of several to further improve the performance. The results showed that specific combinations of preprocessing tasks could significantly improve the evaluation measures. This was later confirmed by the official results, as our system ranked 3rd for both Arabic and Spanish datasets and 14th for the English dataset.
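
The two ingredients named in this abstract, TF-IDF features and the binary relevance transformation, can be sketched in plain Python. Everything below is an illustrative assumption rather than the Tw-StAR implementation: whitespace tokenization, the particular IDF formula (log(N/df) + 1), the function names, and the toy tweets.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Sparse TF-IDF weights per document (docs must be non-empty strings)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log(n_docs / df) + 1.0 for tok, df in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append(
            {tok: (c / len(toks)) * idf[tok] for tok, c in counts.items()}
        )
    return vectors


def binary_relevance_targets(label_sets, labels):
    """Split a multi-label problem into one independent 0/1 target per label."""
    return {
        label: [1 if label in assigned else 0 for assigned in label_sets]
        for label in labels
    }


# Hypothetical tweets with their emotion label sets.
tweets = ["so happy today", "so angry and sad"]
label_sets = [{"joy"}, {"anger", "sadness"}]
vectors = tfidf_vectors(tweets)
targets = binary_relevance_targets(label_sets, ["anger", "joy", "sadness"])
```

Each entry of `targets` can then be paired with the shared feature vectors to train an ordinary binary classifier, one per emotion, and the per-label predictions are merged back into a label set.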

EmoIntens Tracker at SemEval-2018 Task 1: Emotional Intensity Levels in #Tweets
Ramona-Andreea Turcu | Sandra Maria Amarandei | Iuliana-Alexandra Flescan-Lovin-Arseni | Daniela Gifu | Diana Trandabat

The Affect in Tweets task is centered on emotion categorization and evaluation matrix using multi-language tweets (English and Spanish). In this research, the SemEval Affect dataset was preprocessed, categorized, and evaluated accordingly (precision, recall, and accuracy). The system described in this paper is based on the implementation of supervised machine learning (Naive Bayes, KNN and SVM), deep learning (a TensorFlow neural network model), and decision tree algorithms.

THU_NGN at SemEval-2018 Task 1: Fine-grained Tweet Sentiment Intensity Analysis with Attention CNN-LSTM
Chuhan Wu | Fangzhao Wu | Junxin Liu | Zhigang Yuan | Sixing Wu | Yongfeng Huang

Traditional sentiment analysis approaches mainly focus on classifying the sentiment polarities or emotion categories of texts. However, they can’t exploit the sentiment intensity information. Therefore, the SemEval-2018 Task 1 aims to automatically determine the intensity of emotions or sentiment of tweets to mine fine-grained sentiment information. To address this task, we propose a system based on an attention CNN-LSTM model. In our model, an LSTM is used to extract the long-term contextual information from texts. We apply attention techniques to select this information. A CNN layer with kernels of different sizes is used to extract local features. The dense layers take the pooled CNN feature maps and predict the intensity scores. Our system reaches an average Pearson correlation score of 0.722 (ranked 12/48) in the emotion intensity regression task, and 0.810 in the valence regression task (ranked 15/38). It indicates that our system can be further extended.

EiTAKA at SemEval-2018 Task 1: An Ensemble of N-Channels ConvNet and XGboost Regressors for Emotion Analysis of Tweets
Mohammed Jabreel | Antonio Moreno

This paper describes the system that we used in Task 1: Affect in Tweets. We combine two different approaches. The first, called N-Stream ConvNets, is a deep learning approach; the second is an XGboost regressor based on a set of embedding- and lexicon-based features. Our system was evaluated on the testing sets of the tasks, outperforming all other approaches for the Arabic version of the valence intensity regression task and the valence ordinal classification task.

CENTEMENT at SemEval-2018 Task 1: Classification of Tweets using Multiple Thresholds with Self-correction and Weighted Conditional Probabilities
Tariq Ahmad | Allan Ramsay | Hanady Ahmed

In this paper we present our contribution to SemEval-2018, a classifier for classifying multi-label emotions of Arabic and English tweets. We attempted Affect in Tweets, specifically Task E-c: Detecting Emotions (multi-label classification). Our method is based on preprocessing the tweets and creating word vectors, combined with a self-correction step to remove noise. We also make use of emotion-specific thresholds. The final submission was the configuration that achieved the best performance over a range of thresholds. Our system was evaluated on the Arabic and English datasets provided for the task by the competition organisers, where it ranked 2nd for the Arabic dataset (out of 14 entries) and 12th for the English dataset (out of 35 entries).

Yuan at SemEval-2018 Task 1: Tweets Emotion Intensity Prediction using Ensemble Recurrent Neural Network
Min Wang | Xiaobing Zhou

We apply LSTM and BiLSTM models to emotion intensity prediction. We participated only in the third subtask of Task 1: Affect in Tweets. Our system ranks 6th among all the teams.

Amobee at SemEval-2018 Task 1: GRU Neural Network with a CNN Attention Mechanism for Sentiment Classification
Alon Rozental | Daniel Fleischer

This paper describes the participation of Amobee in the shared sentiment analysis task at SemEval 2018. We participated in all the English sub-tasks and the Spanish valence tasks. Our system consists of three parts: training task-specific word embeddings, training a model consisting of gated recurrent units (GRU) with a convolutional neural network (CNN) attention mechanism, and training stacking-based ensembles for each of the sub-tasks. Our algorithm reached the 3rd and 1st places in the valence ordinal classification sub-tasks in English and Spanish, respectively.

ECNU at SemEval-2018 Task 1: Emotion Intensity Prediction Using Effective Features and Machine Learning Models
Huimin Xu | Man Lan | Yuanbin Wu

This paper describes our submissions to SemEval 2018 Task 1. The task is affect intensity prediction in tweets, including five subtasks. We participated in all subtasks for English tweets. We extracted several traditional NLP, sentiment lexicon, emotion lexicon and domain-specific features from tweets, and adopted supervised machine learning algorithms to perform emotion intensity prediction.

NTUA-SLP at SemEval-2018 Task 1: Predicting Affective Content in Tweets with Deep Attentive RNNs and Transfer Learning
Christos Baziotis | Athanasiou Nikolaos | Alexandra Chronopoulou | Athanasia Kolovou | Georgios Paraskevopoulos | Nikolaos Ellinas | Shrikanth Narayanan | Alexandros Potamianos

In this paper we present deep-learning models that we submitted to the SemEval-2018 Task 1 competition: Affect in Tweets. We participated in all subtasks for English tweets. We propose a Bi-LSTM architecture equipped with a multi-layer self-attention mechanism. The attention mechanism improves the model performance and allows us to identify salient words in tweets, as well as gain insight into the models, making them more interpretable. Our model utilizes a set of word2vec word embeddings trained on a large collection of 550 million Twitter messages, augmented by a set of word affective features. Due to the limited amount of task-specific training data, we opted for a transfer learning approach by pretraining the Bi-LSTMs on the dataset of SemEval 2017, Task 4A. The proposed approach ranked 1st in Subtask E Multi-Label Emotion Classification, 2nd in Subtask A Emotion Intensity Regression and achieved competitive results in other subtasks.

CrystalFeel at SemEval-2018 Task 1: Understanding and Detecting Emotion Intensity using Affective Lexicons
Raj Kumar Gupta | Yinping Yang

While sentiment and emotion analysis has received a considerable amount of research attention, the notion of understanding and detecting the intensity of emotions is relatively less explored. This paper describes a system developed for predicting emotion intensity in tweets. Given a Twitter message, CrystalFeel uses features derived from parts-of-speech, n-grams, word embedding, and multiple affective lexicons including Opinion Lexicon, SentiStrength, AFFIN, NRC Emotion & Hash Emotion, and our in-house developed EI Lexicons to predict the degree of the intensity associated with fear, anger, sadness, and joy in the tweet. We found that including the affective lexicons-based features allowed the system to obtain strong prediction performance, while revealing interesting emotion word-level and message-level associations. On gold test data, CrystalFeel obtained Pearson correlations of 0.717 on average emotion intensity and of 0.816 on sentiment intensity.

PlusEmo2Vec at SemEval-2018 Task 1: Exploiting emotion knowledge from emoji and #hashtags
Ji Ho Park | Peng Xu | Pascale Fung

This paper describes our system that has been submitted to SemEval-2018 Task 1: Affect in Tweets (AIT) to solve five subtasks. We focus on modeling both sentence and word level representations of emotion inside texts through large distantly labeled corpora with emojis and hashtags. We transfer the emotional knowledge by exploiting neural network models as feature extractors and use these representations for traditional machine learning models such as support vector regression (SVR) and logistic regression to solve the competition tasks. Our system placed among the top 3 in all subtasks in which we participated.

YNU-HPCC at SemEval-2018 Task 1: BiLSTM with Attention based Sentiment Analysis for Affect in Tweets
You Zhang | Jin Wang | Xuejie Zhang

We implemented the sentiment system in all five subtasks for English and Spanish. All subtasks involve emotion or sentiment intensity prediction (regression and ordinal classification) and emotion determination (multi-label classification). A BiLSTM (Bidirectional Long Short-Term Memory) model with an attention mechanism was the main component of our system. We use BiLSTM in order to get word information extracted from both directions. The attention mechanism was used to find the contribution of each word for improving the scores. Furthermore, based on BiLSTMATT (BiLSTM with attention mechanism), a few deep-learning algorithms were employed for different subtasks. For regression and ordinal classification tasks, we used domain adaptation and ensemble learning methods to leverage the base model, while a single base model was used for the multi-label task.

UG18 at SemEval-2018 Task 1: Generating Additional Training Data for Predicting Emotion Intensity in Spanish
Marloes Kuijper | Mike van Lenthe | Rik van Noord

The present study describes our submission to SemEval 2018 Task 1: Affect in Tweets. Our Spanish-only approach aimed to demonstrate that it is beneficial to automatically generate additional training data by (i) translating training data from other languages and (ii) applying a semi-supervised learning method. We find strong support for both approaches, with those models outperforming our regular models in all subtasks. However, creating a stepwise ensemble of different models as opposed to simply averaging did not result in an increase in performance. We placed second (EI-Reg), second (EI-Oc), fourth (V-Reg) and fifth (V-Oc) in the four Spanish subtasks we participated in.

ISCLAB at SemEval-2018 Task 1: UIR-Miner for Affect in Tweets
Meng Li | Zhenyuan Dong | Zhihao Fan | Kongming Meng | Jinghua Cao | Guanqi Ding | Yuhan Liu | Jiawei Shan | Binyang Li

This paper presents a UIR-Miner system for emotion and sentiment analysis evaluation in Twitter in SemEval 2018. Our system consists of three main modules: preprocessing module, stacking module to solve the intensity prediction of emotion and sentiment, LSTM network module to solve multi-label classification, and the hierarchical attention network module for solving emotion and sentiment classification problem. According to the metrics of SemEval 2018, our system gets the final scores of 0.636, 0.531, 0.731, 0.708, and 0.408 on 5 subtasks, respectively.

TCS Research at SemEval-2018 Task 1: Learning Robust Representations using Multi-Attention Architecture
Hardik Meisheri | Lipika Dey

This paper presents a system description of our submission to SemEval-2018 Task 1: Affect in Tweets for the English language. We combine three different features generated using deep learning models and traditional methods in support vector machines to create a unified ensemble system. A robust representation of a tweet is learned using a multi-attention based architecture which uses a mixture of different pre-trained embeddings. In addition, an analysis of the different features is presented. Our system ranked 2nd, 5th, and 7th in different subtasks among 75 teams.

DMCB at SemEval-2018 Task 1: Transfer Learning of Sentiment Classification Using Group LSTM for Emotion Intensity prediction
Youngmin Kim | Hyunju Lee

This paper describes a system that participated in SemEval-2018 Task 1: Affect in Tweets and predicts emotional intensities. We use Group LSTM with an attention model and transfer learning with sentiment classification data as source data (SemEval 2017 Task 4a). The transfer model structure consists of a source domain and a target domain. Additionally, we try a new dropout that is applied to LSTMs in the Group LSTM. Our system ranked 8th in subtask 1a (emotion intensity regression). We also show various results with different architectures in the source, target and transfer models.

DeepMiner at SemEval-2018 Task 1: Emotion Intensity Recognition Using Deep Representation Learning
Habibeh Naderi | Behrouz Haji Soleimani | Saif Mohammad | Svetlana Kiritchenko | Stan Matwin

In this paper, we propose a regression system to infer the emotion intensity of a tweet. We develop a multi-aspect feature learning mechanism to capture the most discriminative semantic features of a tweet as well as the emotion information conveyed by each word in it. We combine six types of feature groups: (1) a tweet representation learned by an LSTM deep neural network on the training data, (2) a tweet representation learned by an LSTM network on a large corpus of tweets that contain emotion words (a distant supervision corpus), (3) word embeddings trained on the distant supervision corpus and averaged over all words in a tweet, (4) word and character n-grams, (5) features derived from various sentiment and emotion lexicons, and (6) other hand-crafted features. As part of the word embedding training, we also learn the distributed representations of multi-word expressions (MWEs) and negated forms of words. An SVR regressor is then trained over the full set of features. We evaluate the effectiveness of our ensemble feature sets on the SemEval-2018 Task 1 datasets and achieve a Pearson correlation of 72% on the task of tweet emotion intensity prediction.

Zewen at SemEval-2018 Task 1: An Ensemble Model for Affect Prediction in Tweets
Zewen Chi | Heyan Huang | Jiangui Chen | Hao Wu | Ran Wei

This paper presents a method for Affect in Tweets, which is the task of automatically determining the intensity of emotions and the intensity of sentiment of tweets. The term affect refers to emotion-related categories such as anger, fear, etc. The intensity of emotions needs to be quantified as a real-valued score in [0, 1]. We propose an ensemble system including four different deep learning methods, which are CNN, Bidirectional LSTM (BLSTM), LSTM-CNN and a CNN-based Attention model (CA). Our system gets an average Pearson correlation score of 0.682 in the subtask EI-reg and an average Pearson correlation score of 0.784 in subtask V-reg, which ranks 17th among 48 systems in EI-reg and 19th among 38 systems in V-reg.

ARB-SEN at SemEval-2018 Task1: A New Set of Features for Enhancing the Sentiment Intensity Prediction in Arabic Tweets
El Moatez Billah Nagoudi

This article describes our proposed Arabic Sentiment Analysis system named ARB-SEN. This system is designed for the International Workshop on Semantic Evaluation 2018 (SemEval-2018), Task 1: Affect in Tweets. ARB-SEN proposes two supervised models to estimate the sentiment intensity in Arabic tweets. Both models use a set of features including sentiment lexicon, negation, word embedding and emotion symbol features. Our system combines these features to assist the sentiment analysis task. The ARB-SEN system achieves a correlation score of 0.720, ranking 6th among all participants in the valence intensity regression (V-reg) for the Arabic sub-task organized within the SemEval 2018 evaluation campaign.

pdf bib
psyML at SemEval-2018 Task 1: Transfer Learning for Sentiment and Emotion Analysis
Grace Gee | Eugene Wang

In this paper, we describe the first attempt to perform transfer learning from sentiment to emotions. Our system employs Long Short-Term Memory (LSTM) networks, including bidirectional LSTM (biLSTM) and LSTM with attention mechanism. We perform transfer learning by first pre-training the LSTM networks on sentiment data before concatenating the penultimate layers of these networks into a single vector as input to new dense layers. For the E-c subtask, we utilize a novel approach to train models for correlated emotion classes. Our system ranks 4/48, 3/39, 8/38, 4/37 and 4/35 on the English subtasks EI-reg, EI-oc, V-reg, V-oc and E-c, respectively, of SemEval 2018 Task 1: Affect in Tweets.

pdf bib
UIUC at SemEval-2018 Task 1: Recognizing Affect with Ensemble Models
Abhishek Avinash Narwekar | Roxana Girju

Our submission to the SemEval-2018 Task1 : Affect in Tweets shared task competition is a supervised learning model relying on standard lexicon features coupled with word embedding features. We used an ensemble of diverse models, including random forests, gradient boosted trees, and linear models, corrected for training-development set mismatch. We submitted the system’s output for subtasks 1 (emotion intensity prediction), 2 (emotion ordinal classification), 3 (valence intensity regression) and 4 (valence ordinal classification), for English tweets. We placed 25th, 19th, 24th and 15th in the four subtasks respectively. The baseline considered was an SVM (Support Vector Machines) model with linear kernel on the lexicon and embedding based features. Our system’s final performance measured in Pearson correlation scores outperformed the baseline by a margin of 2.2 % to 14.6 % across all tasks.

pdf bib
KU-MTL at SemEval-2018 Task 1: Multi-task Identification of Affect in Tweets
Thomas Nyegaard-Signori | Casper Veistrup Helms | Johannes Bjerva | Isabelle Augenstein

We take a multi-task learning approach to the shared Task 1 at SemEval-2018. The general idea concerning the model structure is to use as little external data as possible in order to preserve the task relatedness and reduce complexity. We employ multi-task learning with hard parameter sharing to exploit the relatedness between sub-tasks. As a base model, we use a standard recurrent neural network for both the classification and regression subtasks. Our system ranks 32nd out of 48 participants with a Pearson score of 0.557 in the first subtask, and 20th out of 35 in the fifth subtask with an accuracy score of 0.464.

pdf bib
EmoNLP at SemEval-2018 Task 2: English Emoji Prediction with Gradient Boosting Regression Tree Method and Bidirectional LSTM
Man Liu

This paper describes our system used in the English Emoji Prediction Task 2 at the SemEval-2018. Our system is based on two supervised machine learning algorithms : Gradient Boosting Regression Tree Method (GBM) and Bidirectional Long Short-term Memory Network (BLSTM). Besides the common features, we extract various lexicon and syntactic features from external resources. After comparing the results of two algorithms, GBM is chosen for the final evaluation.

pdf bib
UMDSub at SemEval-2018 Task 2: Multilingual Emoji Prediction Multi-channel Convolutional Neural Network on Subword Embedding
Zhenduo Wang | Ted Pedersen

This paper describes the UMDSub system that participated in Task 2 of SemEval-2018. We developed a system that predicts an emoji given the raw text in an English tweet. The system is a Multi-channel Convolutional Neural Network based on subword embeddings for the representation of tweets. This model improves on character or word based methods by about 2 %. Our system placed 21st of 48 participating systems in the official evaluation.

pdf bib
UMDuluth-CS8761 at SemEval-2018 Task 2: Emojis: Too many Choices?
Jonathan Beaulieu | Dennis Asamoah Owusu

In this paper, we present our system for assigning an emoji to a tweet based on the text. Each tweet was originally posted with an emoji which the task providers removed. Our task was to decide which of 20 emojis originally came with the tweet. Two datasets were provided, one in English and the other in Spanish. We treated the task as a standard classification task with the emojis as our classes and the tweets as our documents. Our best performing system used a Bag of Words model with a Linear Support Vector Machine as its classifier. We achieved a macro F1 score of 32.73 % for the English data and 17.98 % for the Spanish data.
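Several Task 2 abstracts in this listing, including this one, report macro-averaged F1. A minimal sketch of that metric follows, assuming it is computed as the unweighted mean of per-class F1 over every label seen in the gold or predicted data; the function name and the toy labels are hypothetical, and the official scorer may differ in detail.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predictions."""
    labels = sorted(set(gold) | set(predicted))
    f1_scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, predicted) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, predicted) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, predicted) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)

# Toy example with two emoji classes.
gold = ["heart", "heart", "tree", "tree"]
predicted = ["heart", "tree", "tree", "tree"]
print("macro F1:", macro_f1(gold, predicted))
```

Because every class counts equally, a rare emoji that is always missed lowers macro F1 as much as a frequent one, which is why accuracy and macro F1 can diverge sharply on skewed emoji data.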

pdf bib
THU_NGN at SemEval-2018 Task 2: Residual CNN-LSTM Network with Attention for English Emoji Prediction
Chuhan Wu | Fangzhao Wu | Sixing Wu | Zhigang Yuan | Junxin Liu | Yongfeng Huang

Emojis are widely used by social media and social network users when posting their messages. It is important to study the relationships between messages and emojis. Thus, in SemEval-2018 Task 2 an interesting and challenging task is proposed, i.e., predicting which emojis are evoked by text-based tweets. We propose a residual CNN-LSTM with attention (RCLA) model for this task. Our model combines CNN and LSTM layers to capture both local and long-range contextual information for tweet representation. In addition, attention mechanism is used to select important components. Besides, residual connection is applied to CNN layers to facilitate the training of neural networks. We also incorporated additional features such as POS tags and sentiment features extracted from lexicons. Our model achieved 30.25 % macro-averaged F-score in the first subtask (i.e., emoji prediction in English), ranking 7th out of 48 participants.

pdf bib
#TeamINF at SemEval-2018 Task 2: Emoji Prediction in Tweets
Alison Ribeiro | Nádia Silva

In this paper, we describe a methodology to predict emoji in tweets. Our approach is based on the classic bag-of-words model in conjunction with word embeddings. The classification algorithm used was Logistic Regression. This architecture was used and evaluated in the context of the SemEval 2018 challenge (task 2, subtask 1).

pdf bib
EICA Team at SemEval-2018 Task 2: Semantic and Metadata-based Features for Multilingual Emoji Prediction
Yufei Xie | Qingqing Song

The advent of social media has brought along a novel way of communication where meaning is composed by combining short text messages and visual enhancements, the so-called emojis. We describe our system for participating in SemEval-2018 Task 2 on Multilingual Emoji Prediction. Our approach relies on combining a rich set of features of two types: semantic and metadata. The most important type turned out to be the metadata features. In subtask 1: Emoji Prediction in English, our primary submission obtains a MAP of 16.45, Precision of 31.557, Recall of 16.771 and Accuracy of 30.992.

pdf bib
EmojiIt at SemEval-2018 Task 2: An Effective Attention-Based Recurrent Neural Network Model for Emoji Prediction with Characters Gated Words
Shiyun Chen | Maoquan Wang | Liang He

This paper presents our single model for Subtask 1 of SemEval 2018 Task 2: Emoji Prediction in English. In order to predict the emoji that may be contained in a tweet, the basic model we use is an attention-based recurrent neural network, which has achieved satisfactory performance in natural language processing. Since the text comes from social media and contains many inconsistent abbreviations and online terms, we also combine word-level and character-level word vector embeddings to better handle words that do not appear in the vocabulary. Our single model achieved 29.50 % Macro F-score on the test data and ranks 9th among 48 teams.

pdf bib
Peperomia at SemEval-2018 Task 2: Vector Similarity Based Approach for Emoji Prediction
Jing Chen | Dechuan Yang | Xilian Li | Wei Chen | Tengjiao Wang

This paper describes our participation in SemEval 2018 Task 2 : Multilingual Emoji Prediction, in which participants are asked to predict a tweet’s most associated emoji from 20 emojis. Instead of regarding it as a 20-class classification problem we regard it as a text similarity problem. We propose a vector similarity based approach for this task. First the distributed representation (tweet vector) for each tweet is generated, then the similarity between this tweet vector and each emoji’s embedding is evaluated. The most similar emoji is chosen as the predicted label. Experimental results show that our approach performs comparably with the classification approach and shows its advantage in classifying emojis with similar semantic meaning.
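The similarity-based prediction step in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: cosine similarity is an assumption (the abstract only says "similarity"), and the three-dimensional vectors, emoji names, and function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict_emoji(tweet_vector, emoji_embeddings):
    """Return the emoji whose embedding is most similar to the tweet vector."""
    return max(emoji_embeddings, key=lambda name: cosine(tweet_vector, emoji_embeddings[name]))

# Toy embeddings; in practice these would be learned, one per emoji label.
emoji_embeddings = {
    "red_heart": [0.9, 0.1, 0.0],
    "fire": [0.1, 0.9, 0.1],
    "christmas_tree": [0.0, 0.2, 0.9],
}
tweet_vector = [0.1, 0.3, 0.8]
print(predict_emoji(tweet_vector, emoji_embeddings))
```

Since labels live in the same space as tweets, emojis with similar meanings end up with nearby embeddings, which is consistent with the advantage on semantically similar emojis that the abstract reports.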

pdf bib
NTUA-SLP at SemEval-2018 Task 2: Predicting Emojis using RNNs with Context-aware Attention
Christos Baziotis | Athanasiou Nikolaos | Athanasia Kolovou | Georgios Paraskevopoulos | Nikolaos Ellinas | Alexandros Potamianos

In this paper we present a deep-learning model that competed at SemEval-2018 Task 2 Multilingual Emoji Prediction. We participated in subtask A, in which we are called to predict the most likely associated emoji in English tweets. The proposed architecture relies on a Long Short-Term Memory network, augmented with an attention mechanism that conditions the weight of each word on a context vector, which is taken as the aggregation of a tweet’s meaning. Moreover, we initialize the embedding layer of our model with word2vec word embeddings, pretrained on a dataset of 550 million English tweets. Finally, our model does not rely on hand-crafted features or lexicons and is trained end-to-end with back-propagation. We ranked 2nd out of 48 teams.
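The idea of weighting each word against a context vector can be sketched without any deep-learning library. The version below is a deliberately simplified stand-in, not the NTUA-SLP model: it assumes the context vector is the mean of the word states and scores each word by a plain dot product, where the real model would use learned parameters. All names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_attention(word_states):
    """Weight each word state by its dot product with the mean state (assumed context)."""
    dim = len(word_states[0])
    n = len(word_states)
    # Context vector: simple average of all word states (an assumption).
    context = [sum(state[i] for state in word_states) / n for i in range(dim)]
    scores = [sum(a * b for a, b in zip(state, context)) for state in word_states]
    weights = softmax(scores)
    # Tweet representation: attention-weighted sum of the word states.
    pooled = [sum(w * state[i] for w, state in zip(weights, word_states)) for i in range(dim)]
    return weights, pooled

# Toy word states, e.g. LSTM outputs for a three-word tweet.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, pooled = context_attention(states)
print("attention weights:", weights)
print("tweet vector:", pooled)
```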

pdf bib
Hatching Chick at SemEval-2018 Task 2: Multilingual Emoji Prediction
Joël Coster | Reinder Gerard van Dalen | Nathalie Adriënne Jacqueline Stierman

As part of a SemEval 2018 shared task, an attempt was made to build a system capable of predicting the occurrence of a language’s most frequently used emoji in tweets. Specifically, models for English and Spanish data were created and trained on 500,000 and 100,000 tweets respectively. In order to create these models, first a logistic regressor, a sequential LSTM, a random forest regressor and an SVM were tested. The latter was found to perform best and was therefore optimized individually for both languages. During development, F1-scores of 61 and 82 were obtained for English and Spanish data respectively; in comparison, F1-scores on the official evaluation data were 21 and 18. The significant decrease in performance during evaluation might be explained by overfitting during development and might therefore have been partially prevented by using cross-validation. Overall, emoji which occur in a very specific context, such as a Christmas tree, were found to be most predictable.

pdf bib
EPUTION at SemEval-2018 Task 2: Emoji Prediction with User Adaption
Liyuan Zhou | Qiongkai Xu | Hanna Suominen | Tom Gedeon

This paper describes our approach, called EPUTION, for the open trial of the SemEval-2018 Task 2, Multilingual Emoji Prediction. The task relates to social media, more precisely Twitter, and aims to predict the most likely associated emoji of a tweet. Our solution for this text classification problem explores the idea of transfer learning for adapting the classifier based on users’ tweeting history. Our experiments show that our user-adaption method improves classification results by more than 6 per cent on the macro-averaged F1. Thus, our paper provides evidence for the rationality of enriching the original corpus longitudinally with user behaviors and transferring the lessons learned from corresponding users to specific instances.

pdf bib
PickleTeam! at SemEval-2018 Task 2: English and Spanish Emoji Prediction from Tweets
Daphne Groot | Rémon Kruizinga | Hennie Veldthuis | Simon de Wit | Hessel Haagsma

We present a system for emoji prediction on English and Spanish tweets, prepared for the SemEval-2018 task on Multilingual Emoji Prediction. We compared the performance of an SVM, LSTM and an ensemble of these two. We found the SVM performed best on our development set with an accuracy of 61.3 % for English and 83 % for Spanish. The features used for the SVM are lowercased word n-grams in the range of 1 to 20, tokenised by a TweetTokenizer and stripped of stop words. On the test set, our model achieved an accuracy of 34 % on English, with a slightly lower score of 29.7 % accuracy on Spanish.

pdf bib
YNU-HPCC at SemEval-2018 Task 2: Multi-ensemble Bi-GRU Model with Attention Mechanism for Multilingual Emoji Prediction
Nan Wang | Jin Wang | Xuejie Zhang

This paper describes our approach to SemEval-2018 Task 2, which aims to predict the most likely associated emoji, given a tweet in English or Spanish. We normalized text-based tweets during pre-processing, following which we utilized a bi-directional gated recurrent unit with an attention mechanism to build our base model. Multi-models with or without class weights were trained for the ensemble methods. We boosted models without class weights, and only strong boost classifiers were identified. In our system, not only was a boosting method used, but we also took advantage of the voting ensemble method to enhance our final system result. Our method demonstrated an obvious improvement of approximately 3 % of the macro F1 score in English and 2 % in Spanish.

pdf bib
DUTH at SemEval-2018 Task 2: Emoji Prediction in Tweets
Dimitrios Effrosynidis | Georgios Peikos | Symeon Symeonidis | Avi Arampatzis

This paper describes the approach that was developed for SemEval 2018 Task 2 (Multilingual Emoji Prediction) by the DUTH Team. First, we employed a combination of pre-processing techniques to reduce the noise of tweets and produce a number of features. Then, we built several N-grams, to represent the combination of word and emojis. Finally, we trained our system with a tuned LinearSVC classifier. Our approach in the leaderboard ranked 18th amongst 48 teams.

pdf bib
Duluth UROP at SemEval-2018 Task 2: Multilingual Emoji Prediction with Ensemble Learning and Oversampling
Shuning Jin | Ted Pedersen

This paper describes the Duluth UROP systems that participated in SemEval2018 Task 2, Multilingual Emoji Prediction. We relied on a variety of ensembles made up of classifiers using Naive Bayes, Logistic Regression, and Random Forests. We used unigram and bigram features and tried to offset the skewness of the data through the use of oversampling. Our task evaluation results place us 19th of 48 systems in the English evaluation, and 5th of 21 in the Spanish. After the evaluation we realized that some simple changes to our pre-processing could significantly improve our results. After making these changes we attained results that would have placed us sixth in the English evaluation, and second in the Spanish.
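Oversampling to offset class skew, as mentioned in this abstract, can be sketched as below. The abstract does not say which oversampling variant was used, so random duplication of minority-class examples up to the majority-class size is an assumption; the function name and toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict

def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until all classes match the largest."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap to the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels

# Toy skewed data: three "heart" tweets, one "tree" tweet.
texts = ["love you", "miss you", "best day", "merry christmas"]
labels = ["heart", "heart", "heart", "tree"]
balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only; duplicating examples before splitting would leak copies into the evaluation data.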

pdf bib
CENNLP at SemEval-2018 Task 2: Enhanced Distributed Representation of Text using Target Classes for Emoji Prediction Representation
Naveen J R | Hariharan V | Barathi Ganesh H. B. | Anand Kumar M | Soman K P

Emoji is one of the fastest-growing languages in pop culture, especially in social media, and its usage is very unlikely to decrease. Emojis are generally used to bring an extra level of meaning to texts posted on social media platforms. This added information gives more insight into the plain text, giving rise to hidden interpretations within the text. This paper explains our analysis of Task 2, the Multilingual Emoji Prediction shared task conducted at SemEval-2018. In the task, each piece of Twitter text is labelled with one of 20 different classes (the most commonly used emojis); these classes are learnt, and predictions are then made for unseen Twitter text. In this work, we have experimented with and analysed emoji prediction based on Twitter text as a classification problem in which the accompanying emoji is considered the label for every individual text. We have implemented this using a distributed representation of text through fastText. We have also made an effort to demonstrate how the fastText framework can be useful for emoji prediction. The task is divided into two subtasks, based on datasets in two different languages, English and Spanish.

pdf bib
LIS at SemEval-2018 Task 2: Mixing Word Embeddings and Bag of Features for Multilingual Emoji Prediction
Gaël Guibon | Magalie Ochs | Patrice Bellot

In this paper we present the system submitted to SemEval-2018 Task 2: Multilingual Emoji Prediction. Our system treats both languages equally, first by considering word embeddings combined with automatically computed features of different types, then by applying the bagging algorithm RandomForest to predict the emoji of a tweet.

pdf bib
ALANIS at SemEval-2018 Task 3: A Feature Engineering Approach to Irony Detection in English Tweets
Kevin Swanberg | Madiha Mirza | Ted Pedersen | Zhenduo Wang

This paper describes the ALANIS system that participated in Task 3 of SemEval-2018. We develop a system for detection of irony, as well as the detection of three types of irony: verbal polar irony, other verbal irony, and situational irony. The system uses a logistic regression model in subtask A and a voted classifier system with manually developed features to identify ironic tweets. This model improves on a Naive Bayes baseline by about 8 percent on the training set.

pdf bib
UWB at SemEval-2018 Task 3: Irony detection in English tweets
Tomáš Hercig

This paper describes our system created for the SemEval-2018 Task 3 : Irony detection in English tweets. Our strongly constrained system uses only the provided training data without any additional external resources. Our system is based on Maximum Entropy classifier and various features using parse tree, POS tags, and morphological features. Even without additional lexicons and word embeddings we achieved fourth place in Subtask A and seventh in Subtask B in terms of accuracy.

pdf bib
NIHRIO at SemEval-2018 Task 3: A Simple and Accurate Neural Network Model for Irony Detection in Twitter
Thanh Vu | Dat Quoc Nguyen | Xuan-Son Vu | Dai Quoc Nguyen | Michael Catt | Michael Trenell

This paper describes our NIHRIO system for SemEval-2018 Task 3 Irony detection in English tweets. We propose to use a simple neural network architecture of Multilayer Perceptron with various types of input features including: lexical, syntactic, semantic and polarity features. Our system achieves very high performance in both subtasks of binary and multi-class irony detection in tweets. In particular, we rank at least fourth using the accuracy metric and sixth using the F1 metric. Our code is available at: https://github.com/NIHRIO/IronyDetectionInTwitter

pdf bib
LDR at SemEval-2018 Task 3: A Low Dimensional Text Representation for Irony Detection
Bilal Ghanem | Francisco Rangel | Paolo Rosso

In this paper we describe our participation in the SemEval-2018 Task 3 Shared Task on Irony Detection. We have approached the task with our low dimensionality representation method (LDR), which exploits low dimensional features extracted from text on the basis of the occurrence probability of the words depending on each class. Our intuition is that words in ironic texts have a different probability of occurrence than in non-ironic ones. Our approach obtained acceptable results in both subtasks A and B. We have performed an error analysis that shows the differences between correctly and incorrectly classified tweets.
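The intuition behind class-conditional word probabilities can be illustrated with a small sketch. This is a simplified stand-in and not the authors' LDR method: it assumes add-one smoothing and represents a tweet by one feature per class (the mean probability of its words under that class). All names and the toy tweets are hypothetical.

```python
from collections import Counter

def class_word_probabilities(tweets, labels):
    """Estimate P(word | class) with add-one smoothing over a shared vocabulary."""
    counts = {}
    for tweet, label in zip(tweets, labels):
        counts.setdefault(label, Counter()).update(tweet.lower().split())
    vocab = {word for counter in counts.values() for word in counter}
    probs = {}
    for label, counter in counts.items():
        total = sum(counter.values()) + len(vocab)
        probs[label] = {word: (counter[word] + 1) / total for word in vocab}
    return probs

def low_dim_features(tweet, probs):
    """One feature per class: mean class-conditional probability of the tweet's words."""
    words = tweet.lower().split()
    features = []
    for label in sorted(probs):
        # Words never seen in training contribute 0.0 (a simplifying assumption).
        values = [probs[label].get(word, 0.0) for word in words]
        features.append(sum(values) / len(values) if values else 0.0)
    return features

tweets = ["great another monday", "love waiting in line",
          "nice weather today", "the meeting is at noon"]
labels = ["ironic", "ironic", "plain", "plain"]
probs = class_word_probabilities(tweets, labels)
print(low_dim_features("another monday", probs))
```

The resulting vector has as many dimensions as there are classes, regardless of vocabulary size, and can be passed to any standard classifier.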

pdf bib
PunFields at SemEval-2018 Task 3: Detecting Irony by Tools of Humor Analysis
Elena Mikhalkova | Yuri Karyakin | Alexander Voronov | Dmitry Grigoriev | Artem Leoznov

The paper describes our search for a universal algorithm of detecting intentional lexical ambiguity in different forms of creative language. At SemEval-2018 Task 3, we used PunFields, the system of automatic analysis of English puns that we introduced at SemEval-2017, to detect irony in tweets. Preliminary tests showed that it can reach the score of F1=0.596. However, at the competition, its result was F1=0.549.

pdf bib
HashCount at SemEval-2018 Task 3: Concatenative Featurization of Tweet and Hashtags for Irony Detection
Won Ik Cho | Woo Hyun Kang | Nam Soo Kim

This paper proposes a novel feature extraction process for SemEval Task 3: Irony detection in English tweets. The proposed system incorporates a concatenative featurization of tweet and hashtags, which helps distinguish between the irony-related and the other components. The system embeds tweets into a vector sequence with widely used pretrained word vectors, partially using a character embedding for out-of-vocabulary words. Identification was performed with BiLSTM and CNN classifiers, achieving F1 scores of 0.5939 (23/42) and 0.3925 (10/28) for the binary and the multi-class case, respectively. The reliability of the proposed scheme was verified by analyzing the Gold test data, which demonstrates how hashtags can be taken into account when identifying various types of irony.

pdf bib
WLV at SemEval-2018 Task 3: Dissecting Tweets in Search of Irony
Omid Rohanian | Shiva Taslimipoor | Richard Evans | Ruslan Mitkov

This paper describes the systems submitted to SemEval 2018 Task 3 Irony detection in English tweets for both subtasks A and B. The first system leveraging a combination of sentiment, distributional semantic, and text surface features is ranked third among 44 teams according to the official leaderboard of the subtask A. The second system with slightly different representation of the features ranked ninth in subtask B. We present a method that entails decomposing tweets into separate parts. Searching for contrast within the constituents of a tweet is an integral part of our system. We embrace an extensive definition of contrast which leads to a vast coverage in detecting ironic content.

pdf bib
Irony Detector at SemEval-2018 Task 3: Irony Detection in English Tweets using Word Graph
Usman Ahmed | Lubna Zafar | Faiza Qayyum | Muhammad Arshad Islam

This paper describes the irony detection system that participated in SemEval-2018 Task 3: Irony detection in English tweets. The system participated in subtasks A and B. This paper discusses the results of our system in the development, evaluation and post-evaluation phases. Each class in the dataset is represented as a directed unweighted graph. Then, a comparison is carried out with each class graph, which results in a vector. This vector is used as features by a machine learning algorithm. The model is evaluated with a hold-out strategy. The organizers randomly split the data into a training set of 80 % (3,833 instances), provided to the participants for training their systems, and a test set of 20 % (958 instances). The test set is reserved to evaluate the performance of the participants' systems. During the evaluation, our system ranked 23rd in the CodaLab results for subtask A (binary class problem). The binary class system achieves accuracy 0.6135, precision 0.5091, recall 0.7170 and F measure 0.5955. The subtask B (multi-class problem) system is ranked 22nd in the CodaLab results. The multi-class model achieves accuracy 0.4158, precision 0.4055, recall 0.3526 and F measure 0.3101.
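The binary-class figures in this abstract can be sanity-checked with the standard F measure, the harmonic mean of precision and recall. The sketch below assumes the balanced F1 variant (beta = 1); the function name is hypothetical. With the published precision and recall it agrees with the reported 0.5955 to within the rounding of those two inputs.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)

# Precision and recall as reported for subtask A (binary irony detection).
print(f_measure(0.5091, 0.7170))
```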

pdf bib
Lancaster at SemEval-2018 Task 3: Investigating Ironic Features in English Tweets
Edward Dearden | Alistair Baron

This paper describes the system we submitted to SemEval-2018 Task 3. The aim of the system is to distinguish between irony and non-irony in English tweets. We create a targeted feature set and analyse how different features are useful in the task of irony detection, achieving an F1-score of 0.5914. The analysis of individual features provides insight that may be useful in future attempts at detecting irony in tweets.

pdf bib
INAOE-UPV at SemEval-2018 Task 3: An Ensemble Approach for Irony Detection in Twitter
Delia Irazú Hernández Farías | Fernando Sánchez-Vega | Manuel Montes-y-Gómez | Paolo Rosso

This paper describes an ensemble approach to the SemEval-2018 Task 3. The proposed method is composed of two renowned methods in text classification together with a novel approach for capturing ironic content by exploiting a tailored lexicon for irony detection. We experimented with different ensemble settings. The obtained results show that our method has a good performance for detecting the presence of ironic content in Twitter.

pdf bib
KLUEnicorn at SemEval-2018 Task 3: A Naive Approach to Irony Detection
Luise Dürlich

This paper describes the KLUEnicorn system submitted to the SemEval-2018 task on Irony detection in English tweets. The proposed system uses a naive Bayes classifier to exploit rather simple lexical, pragmatical and semantical features as well as sentiment. It further takes a closer look at different adverb categories and named entities and factors in word-embedding information.

pdf bib
YNU-HPCC at SemEval-2018 Task 3: Ensemble Neural Network Models for Irony Detection on Twitter
Bo Peng | Jin Wang | Xuejie Zhang

This paper describes the system we proposed for participating in the first year of the Irony detection in English tweets competition. Previous work demonstrates that LSTM models have achieved remarkable performance in natural language processing; besides, combining multiple classifications from various individual classifiers is in general more powerful than a single classification. In order to obtain more precise classification for irony detection, our system trained several individual neural network classifiers and combined their results according to the ensemble-learning algorithm.

pdf bib
Binarizer at SemEval-2018 Task 3: Parsing dependency and deep learning for irony detection
Nishant Nikhil | Muktabh Mayank Srivastava

In this paper, we describe the system submitted for the SemEval 2018 Task 3 (Irony detection in English tweets) Subtask A by the team Binarizer. Irony detection is a key task for many natural language processing applications. Our method treats ironic tweets as consisting of smaller parts containing different emotions. We break down tweets into separate phrases using a dependency parser. We then embed those phrases using an LSTM-based neural network model which is pre-trained to predict emoticons for tweets. Finally, we train a fully-connected network to achieve classification.

pdf bib
ValenTO at SemEval-2018 Task 3: Exploring the Role of Affective Content for Detecting Irony in English Tweets
Delia Irazú Hernández Farías | Viviana Patti | Paolo Rosso

In this paper we describe the system used by the ValenTO team in the shared task on Irony Detection in English Tweets at SemEval 2018. The system takes as its starting point emotIDM, an irony detection model that explores the use of affective features based on a wide range of lexical resources available for English, reflecting different facets of affect. We experimented with different settings, by exploiting different classifiers and features, and participated both in the binary irony detection task and in the task devoted to distinguishing among different types of irony. We report on the results obtained by our system both in a constrained setting and in an unconstrained setting, where we explored the impact of using additional data in the training phase, such as corpora annotated for the presence of irony or sarcasm from the state of the art. Overall, the performance of our system seems to validate the important role that affective information has for identifying ironic content in Twitter.

pdf bib
NewsReader at SemEval-2018 Task 5: Counting events by reasoning over event-centric-knowledge-graphs
Piek Vossen

In this paper, we describe the participation of the NewsReader system in the SemEval-2018 Task 5 on Counting Events and Participants in the Long Tail. NewsReader is a generic unsupervised text processing system that detects events with participants, time and place to generate Event Centric Knowledge Graphs (ECKGs). We minimally adapted these ECKGs to establish a baseline performance for the task. We first use the ECKGs to establish which documents report on the same incident and what event mentions are coreferential. Next, we aggregate ECKGs across coreferential mentions and use the aggregated knowledge to answer the questions of the task. Our participation tests the quality of NewsReader to create ECKGs, as well as the potential of ECKGs to establish event identity and reason over the result to answer the task queries.

SemEval-2018 Task 8: Semantic Extraction from CybersecUrity REports using Natural Language Processing (SecureNLP)
Peter Phandi | Amila Silva | Wei Lu

This paper describes the SemEval 2018 shared task on semantic extraction from cybersecurity reports, which is introduced for the first time as a shared task on SemEval. This task comprises four SubTasks done incrementally to predict the characteristics of a specific malware using cybersecurity reports. To the best of our knowledge, we introduce the world’s largest publicly available dataset of annotated malware reports in this task. This task received in total 18 submissions from 9 participating teams.

DM_NLP at SemEval-2018 Task 8: neural sequence labeling with linguistic features
Chunping Ma | Huafei Zheng | Pengjun Xie | Chen Li | Linlin Li | Luo Si

This paper describes our submissions for SemEval-2018 Task 8: Semantic Extraction from CybersecUrity REports using NLP. The DM_NLP team participated in two subtasks: SubTask 1 classifies whether a sentence is useful for inferring malware actions and capabilities, and SubTask 2 predicts token labels (Action, Entity, Modifier and Others) for a given malware-related sentence. Since we leverage the results of SubTask 2 directly to infer the result of SubTask 1, the paper focuses on the system solving SubTask 2. Treating SubTask 2 as a sequence labeling task, our system relies on a recurrent neural network named BiLSTM-CNN-CRF with rich linguistic features, such as POS tags, dependency parsing labels, chunking labels, NER labels, and Brown clustering. Our system achieved the highest F1 score at both the token level and the phrase level.

SemEval-2018 Task 9: Hypernym Discovery
Jose Camacho-Collados | Claudio Delli Bovi | Luis Espinosa-Anke | Sergio Oramas | Tommaso Pasini | Enrico Santus | Vered Shwartz | Roberto Navigli | Horacio Saggion

This paper describes the SemEval 2018 Shared Task on Hypernym Discovery. We put forward this task as a complementary benchmark for modeling hypernymy, a problem which has traditionally been cast as a binary classification task, taking a pair of candidate words as input. Instead, our reformulated task is defined as follows: given an input term, retrieve (or discover) its suitable hypernyms from a target corpus. We proposed five different subtasks covering three languages (English, Spanish, and Italian), and two specific domains of knowledge in English (Medical and Music). Participants were allowed to compete in any or all of the subtasks. Overall, a total of 11 teams participated, with a total of 39 different systems submitted through all subtasks. Data, results and further information about the task can be found at https://competitions.codalab.org/competitions/17119.

SemEval-2018 Task 10: Capturing Discriminative Attributes
Alicia Krebs | Alessandro Lenci | Denis Paperno

This paper describes the SemEval 2018 Task 10 on Capturing Discriminative Attributes. Participants were asked to identify whether an attribute could help discriminate between two concepts. For example, a successful system should determine that ‘urine’ is a discriminating feature in the word pair ‘kidney’, ‘bone’. The aim of the task is to better evaluate the capabilities of state of the art semantic models, beyond pure semantic similarity. The task attracted submissions from 21 teams, and the best system achieved a 0.75 F1 score.

SemEval-2018 Task 11: Machine Comprehension Using Commonsense Knowledge
Simon Ostermann | Michael Roth | Ashutosh Modi | Stefan Thater | Manfred Pinkal

This report summarizes the results of the SemEval 2018 task on machine comprehension using commonsense knowledge. For this machine comprehension task, we created a new corpus, MCScript. It contains a high number of questions that require commonsense knowledge for finding the correct answer. Eleven teams from 4 different countries participated in this shared task, most of them using neural approaches. The best performing system achieves an accuracy of 83.95 %, outperforming the baselines by a large margin, but still far from the human upper bound, which was found to be 98 %.

Yuanfudao at SemEval-2018 Task 11: Three-way Attention and Relational Knowledge for Commonsense Machine Comprehension
Liang Wang | Meng Sun | Wei Zhao | Kewei Shen | Jingming Liu

This paper describes our system for SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge. We use Three-way Attentive Networks (TriAN) to model interactions between the passage, question and answers. To incorporate commonsense knowledge, we augment the input with relation embeddings from the general knowledge graph ConceptNet. As a result, our system achieves state-of-the-art performance with 83.95 % accuracy on the official test data. Code is publicly available at https://github.com/intfloat/commonsense-rc.

SemEval-2018 Task 12: The Argument Reasoning Comprehension Task
Ivan Habernal | Henning Wachsmuth | Iryna Gurevych | Benno Stein

A natural language argument is composed of a claim as well as reasons given as premises for the claim. The warrant explaining the reasoning is usually left implicit, as it is clear from the context and common sense. This makes the comprehension of arguments easy for humans but hard for machines. This paper summarizes the first shared task on argument reasoning comprehension. Given a premise and a claim along with some topic information, the goal was to automatically identify the correct warrant among two candidates that are plausible and lexically close, but in fact imply opposite claims. We describe the dataset with 1970 instances that we built for the task, and we outline the 21 computational approaches that participated, most of which used neural networks. The results reveal the complexity of the task, with many approaches hardly improving over the random accuracy of about 0.5. Still, the best observed accuracy (0.712) underlines that identifying warrants is feasible in principle. Our analysis indicates that an inclusion of external knowledge is key to reasoning comprehension.

GIST at SemEval-2018 Task 12: A network transferring inference knowledge to Argument Reasoning Comprehension task
HongSeok Choi | Hyunju Lee

This paper describes our GIST team system that participated in the SemEval-2018 Argument Reasoning Comprehension task (Task 12). Here, we address two challenging factors: unstated common sense and two lexically close warrants that lead to contradicting claims. A key idea for our system is the full use of transfer learning from the Natural Language Inference (NLI) task to this task. We used the Enhanced Sequential Inference Model (ESIM) to learn the NLI dataset. We describe how to use ESIM for transfer learning to choose the correct warrant through a proposed system. We show comparable results through ablation experiments. Our system ranked 1st among 22 systems, outperforming all the other systems by more than 10 %.

LightRel at SemEval-2018 Task 7: Lightweight and Fast Relation Classification
Tyler Renslow | Günter Neumann

We present LightRel, a lightweight and fast relation classifier. Our goal is to develop a high baseline for different relation extraction tasks. By defining only very few data-internal, word-level features and external knowledge sources in the form of word clusters and word embeddings, we train a fast and simple linear classifier

OhioState at SemEval-2018 Task 7: Exploiting Data Augmentation for Relation Classification in Scientific Papers Using Piecewise Convolutional Neural Networks
Dushyanta Dhyani

We describe our system for SemEval-2018 Shared Task on Semantic Relation Extraction and Classification in Scientific Papers where we focus on the Classification task. Our simple piecewise convolution neural encoder performs decently in an end to end manner. A simple inter-task data augmentation significantly boosts the performance of the model. Our best-performing systems stood 8th out of 20 teams on the classification task on noisy data and 12th out of 28 teams on the classification task on clean data.

UC3M-NII Team at SemEval-2018 Task 7: Semantic Relation Classification in Scientific Papers via Convolutional Neural Network
Víctor Suárez-Paniagua | Isabel Segura-Bedmar | Akiko Aizawa

This paper reports our participation in SemEval-2018 Task 7 on extraction and classification of relationships between entities in scientific papers. Our approach is based on the use of a Convolutional Neural Network (CNN) trained on 350 abstracts with manually annotated entities and relations. Our hypothesis is that this deep learning model can be applied to extract and classify relations between entities for scientific papers at the same time. We use the Part-of-Speech and the distances to the target entities as part of the embedding for each word and we blind all the entities by marker names. In addition, we use sampling techniques to overcome the imbalance issues of this dataset. Our architecture obtained an F1-score of 35.4 % for the relation extraction task and 18.5 % for the relation classification task with a basic configuration of the one step CNN.
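
The entity blinding and distance features mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the marker names `ENTITY_1` and `ENTITY_2`, the function name `blind_entities`, and the toy sentence are assumptions made for illustration only.

```python
def blind_entities(tokens, first_span, second_span):
    """Replace two entity spans with markers and compute relative distances.

    Spans are (start, end) token indices with an exclusive end. The first
    span must come before the second and the spans must not overlap.
    Marker names are hypothetical, not taken from the paper.
    """
    (s1, e1), (s2, e2) = first_span, second_span
    if not (0 <= s1 < e1 <= s2 < e2 <= len(tokens)):
        raise ValueError("spans must be ordered, non-empty and non-overlapping")

    # Each multi-token entity collapses into a single marker token.
    blinded = tokens[:s1] + ["ENTITY_1"] + tokens[e1:s2] + ["ENTITY_2"] + tokens[e2:]

    # Positions of the two markers in the blinded sequence.
    pos1 = s1
    pos2 = s2 - (e1 - s1) + 1

    # Signed distance of every token to each marker.
    dist1 = [i - pos1 for i in range(len(blinded))]
    dist2 = [i - pos2 for i in range(len(blinded))]
    return blinded, dist1, dist2


toy_tokens = ["We", "use", "a", "neural", "network", "for", "machine", "translation"]
blinded, dist1, dist2 = blind_entities(toy_tokens, (3, 5), (6, 8))
```

In a CNN of the kind described, each distance would typically be mapped to a learned position embedding and concatenated with the word and Part-of-Speech embeddings; that step is omitted here.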

MIT-MEDG at SemEval-2018 Task 7: Semantic Relation Classification via Convolution Neural Network
Di Jin | Franck Dernoncourt | Elena Sergeeva | Matthew McDermott | Geeticka Chauhan

SemEval 2018 Task 7 asked participants to build a system that classifies two entities within a sentence into one of the 6 possible relation types. We tested 3 classes of models: linear classifiers, Long Short-Term Memory (LSTM) models, and Convolutional Neural Network (CNN) models. Ultimately, the CNN model class proved most performant, so we specialized to this model for our final submissions. We improved performance beyond a vanilla CNN by including a variant of negative sampling, using custom word embeddings learned over a corpus of ACL articles, training over corpora of both tasks 1.1 and 1.2, using reversed features, using part of the context words beyond the entity pairs, and using ensemble methods to improve our final predictions. We also tested attention-based pooling, up-sampling, and data augmentation, but none improved performance. Our model achieved rank 6 out of 28 (macro-averaged F1-score: 72.7) in subtask 1.1, and rank 4 out of 20 (macro F1: 80.6) in subtask 1.2.

SIRIUS-LTG-UiO at SemEval-2018 Task 7: Convolutional Neural Networks with Shortest Dependency Paths for Semantic Relation Extraction and Classification in Scientific Papers
Farhad Nooralahzadeh | Lilja Øvrelid | Jan Tore Lønning

This article presents the SIRIUS-LTG-UiO system for the SemEval 2018 Task 7 on Semantic Relation Extraction and Classification in Scientific Papers. First we extract the shortest dependency path (sdp) between two entities, then we introduce a convolutional neural network (CNN) which takes the shortest dependency path embeddings as input and performs relation classification with differing objectives for each subtask of the shared task. This approach achieved overall F1 scores of 76.7 and 83.2 for relation classification on clean and noisy data, respectively. Furthermore, for combined relation extraction and classification on clean data, it obtained F1 scores of 37.4 and 33.6 for each phase. Our system ranks 3rd in all three sub-tasks of the shared task.
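
As a rough illustration of the first step described in this abstract, the shortest dependency path between two entity tokens can be found by breadth-first search over the dependency tree treated as an undirected graph. This is a sketch under stated assumptions, not the authors' implementation: the function name `shortest_dependency_path`, the hand-written toy edges, and the use of unique token strings as node identifiers are all hypothetical.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the list of tokens on the path from source to target.

    edges: iterable of (head, dependent) pairs from a dependency parse.
    The tree is treated as undirected so the path may go up and down.
    Returns None if either token is missing or no path exists.
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)

    if source not in graph or target not in graph:
        return None

    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk back through the predecessor links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hand-written toy parse of "parser uses features from treebank" (assumed).
toy_edges = [
    ("uses", "parser"),
    ("uses", "features"),
    ("features", "treebank"),
    ("treebank", "from"),
]
sdp = shortest_dependency_path(toy_edges, "parser", "treebank")
```

A real system would take the edges from a dependency parser and index tokens by position rather than by string, since the same word can occur twice in a sentence.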

Texterra at SemEval-2018 Task 7: Exploiting Syntactic Information for Relation Extraction and Classification in Scientific Papers
Andrey Sysoev | Vladimir Mayorov

In this work we evaluate the applicability of entity pair models and neural network architectures for relation extraction and classification in scientific papers at SemEval-2018. We carry out experiments representing entity pairs through sentence tokens and through the shortest path in the dependency tree, comparing approaches based on convolutional and recurrent neural networks. With a convolutional network applied to the shortest path in the dependency tree, we ranked eighth in subtask 1.1 (clean data) and ninth in subtask 1.2 (noisy data). A similar model applied to separate parts of the shortest path reached ninth (extraction track) and seventh (classification track) positions in the subtask 2 ranking.

UniMa at SemEval-2018 Task 7: Semantic Relation Extraction and Classification from Scientific Publications
Thorsten Keiper | Zhonghao Lyu | Sara Pooladzadeh | Yuan Xu | Jingyi Zhang | Anne Lauscher | Simone Paolo Ponzetto

Large repositories of scientific literature call for the development of robust methods to extract information from scholarly papers. This problem is addressed by the SemEval 2018 Task 7 on extracting and classifying relations found within scientific publications. In this paper, we present a feature-based and a deep learning-based approach to the task and discuss the results of the system runs that we submitted for evaluation.

ClaiRE at SemEval-2018 Task 7: Classification of Relations using Embeddings
Lena Hettinger | Alexander Dallmann | Albin Zehe | Thomas Niebler | Andreas Hotho

In this paper we describe our system for SemEval-2018 Task 7 on classification of semantic relations in scientific literature for clean (subtask 1.1) and noisy data (subtask 1.2). We compare two models for classification, a C-LSTM which utilizes only word embeddings and an SVM that also takes handcrafted features into account. To adapt to the domain of science we train word embeddings on scientific papers collected from arXiv.org. The hand-crafted features consist of lexical features to model the semantic relations as well as the entities between which the relation holds. Classification of Relations using Embeddings (ClaiRE) achieved an F1 score of 74.89 % for the first subtask and 78.39 % for the second.

NTNU at SemEval-2018 Task 7: Classifier Ensembling for Semantic Relation Identification and Classification in Scientific Papers
Biswanath Barik | Utpal Kumar Sikdar | Björn Gambäck

The paper presents NTNU’s contribution to SemEval-2018 Task 7 on relation identification and classification. The class weights and parameters of five alternative supervised classifiers were optimized through grid search and cross-validation. The outputs of the classifiers were combined through voting for the final prediction. A wide variety of features were explored, with the most informative identified by feature selection. The best setting achieved F1 scores of 47.4 % and 66.0 % in the relation classification subtasks 1.1 and 1.2. For relation identification and classification in subtask 2, it achieved F1 scores of 33.9 % and 17.0 %,
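
The combination step mentioned in this abstract can be sketched as follows. The abstract does not specify the voting scheme, so plain majority voting is assumed here; the function name `majority_vote` and the toy labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine per-classifier label lists into one label per instance.

    predictions_per_classifier: list of equally long label lists, one list
    per classifier. Ties go to the label that appears first among the votes,
    which follows from how Counter.most_common orders equal counts.
    """
    combined = []
    for votes in zip(*predictions_per_classifier):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Three hypothetical classifiers labelling three instances.
toy_predictions = [
    ["USAGE", "COMPARE", "RESULT"],
    ["USAGE", "RESULT", "RESULT"],
    ["COMPARE", "COMPARE", "RESULT"],
]
final_labels = majority_vote(toy_predictions)
```

Weighted voting, where each classifier's vote counts in proportion to its validation score, is a common alternative when the classifiers differ strongly in quality.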

Talla at SemEval-2018 Task 7: Hybrid Loss Optimization for Relation Classification using Convolutional Neural Networks
Bhanu Pratap | Daniel Shank | Oladipo Ositelu | Byron Galbraith

This paper describes our approach to SemEval-2018 Task 7: given an entity-tagged text from the ACL Anthology corpus, identify and classify pairs of entities that have one of six possible semantic relationships. Our model consists of a convolutional neural network leveraging pre-trained word embeddings, unlabeled ACL-abstracts, and multiple window sizes to automatically learn useful features from entity-tagged sentences. We also experiment with a hybrid loss function, a combination of cross-entropy loss and ranking loss, to boost the separation in classification scores. Lastly, we include WordNet-based features to further improve the performance of our model. Our best model achieves F1(macro) scores of 74.2 and 84.8 on subtasks 1.1 and 1.2, respectively.
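
One way to read the hybrid loss mentioned in this abstract is as a weighted sum of a softmax cross-entropy term and a margin-based ranking term. The abstract does not give the exact formulation, so everything below is an assumption for illustration: the hinge form of the ranking term, the mixing weight `alpha`, the `margin` value, and all function names.

```python
import math


def cross_entropy(scores, true_index):
    """Softmax cross-entropy for one example, computed in a stable way."""
    max_score = max(scores)
    log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_sum_exp - scores[true_index]


def margin_ranking(scores, true_index, margin=1.0):
    """Hinge penalty when the best wrong class is within `margin` of the true class."""
    best_wrong = max(s for i, s in enumerate(scores) if i != true_index)
    return max(0.0, margin - scores[true_index] + best_wrong)


def hybrid_loss(scores, true_index, alpha=0.5, margin=1.0):
    """Weighted sum of the two terms; alpha and margin are assumed hyperparameters."""
    ce = cross_entropy(scores, true_index)
    rank = margin_ranking(scores, true_index, margin)
    return alpha * ce + (1.0 - alpha) * rank


toy_scores = [2.0, 0.5, 0.1]  # hypothetical class scores for one sentence
loss_when_correct = hybrid_loss(toy_scores, true_index=0)
loss_when_wrong = hybrid_loss(toy_scores, true_index=1)
```

The ranking term is zero once the true class leads by at least the margin, so it mainly pushes apart scores that cross-entropy alone leaves close together.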

TeamDL at SemEval-2018 Task 8: Cybersecurity Text Analysis using Convolutional Neural Network and Conditional Random Fields
Manikandan R | Krishna Madgula | Snehanshu Saha

In this work we present our participation in SemEval-2018 Task 8, subtasks 1 and 2. We developed a Convolutional Neural Network system for malware sentence classification (subtask 1) and a Conditional Random Fields system for malware token label prediction (subtask 2). We experimented with a couple of word embedding strategies and feature sets and achieved competitive performance across the two subtasks. For subtask 1, we experimented with two categories of word embeddings, namely native embeddings and task-specific embeddings, using the Word2vec and GloVe algorithms. Native embeddings: all words, including the unknown ones that are randomly initialized, use embeddings from the original Word2vec / GloVe models. Task-specific embeddings: the embeddings are generated by training the Word2vec / GloVe algorithms on sentences from MalwareTextDB. We found that GloVe outperforms the rest of the embeddings for subtask 1. For subtask 2, we used N-grams of size 6, previous and next tokens and labels, features giving disjunctions of words anywhere in the left or right, word shape features, word lemmas of the current, previous and next words, word-tag pair features, POS tags, prefixes and suffixes.

HCCL at SemEval-2018 Task 8: An End-to-End System for Sequence Labeling from Cybersecurity Reports
Mingming Fu | Xuemin Zhao | Yonghong Yan

This paper describes the HCCL team systems that participated in SemEval 2018 Task 8: SecureNLP (Semantic Extraction from cybersecurity reports using NLP). To solve the problem, our team applied a neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bi-directional LSTM, CNN and CRF (Ma and Hovy, 2016). Our system is truly end-to-end, requiring no feature engineering or data preprocessing, and we ranked 4th in subtask 1, 7th in subtask 2 and 3rd in SubTask2-relaxed.

UMBC at SemEval-2018 Task 8: Understanding Text about Malware
Ankur Padia | Arpita Roy | Taneeya Satyapanich | Francis Ferraro | Shimei Pan | Youngja Park | Anupam Joshi | Tim Finin

We describe the systems developed by the UMBC team for 2018 SemEval Task 8, SecureNLP (Semantic Extraction from CybersecUrity REports using Natural Language Processing). We participated in three of the sub-tasks: (1) classifying sentences as being relevant or irrelevant to malware, (2) predicting token labels for sentences, and (4) predicting attribute labels from the Malware Attribute Enumeration and Characterization vocabulary for defining malware characteristics. We achieve F1 scores of 50.34/18.0 (dev / test), 22.23 (test-data), and 31.98 (test-data) for Task1, Task2 and Task2 respectively. We also make our cybersecurity embeddings publicly available at http://bit.ly/cyber2vec.

Villani at SemEval-2018 Task 8: Semantic Extraction from Cybersecurity Reports using Representation Learning
Pablo Loyola | Kugamoorthy Gajananan | Yuji Watanabe | Fumiko Satoh

In this paper, we describe our proposal for the task of Semantic Extraction from Cybersecurity Reports. The goal is to explore whether natural language processing methods can provide relevant and actionable knowledge that contributes to a better understanding of malicious behavior. Our method consists of an attention-based Bi-LSTM which achieved a competitive performance of 0.57 for Subtask 1. In the process, we also present ablation studies across multiple embeddings and their level of representation, and report the strategies we used to mitigate the extreme imbalance between classes.

Digital Operatives at SemEval-2018 Task 8: Using dependency features for malware NLP
Chris Brew

The four sub-tasks of SecureNLP build towards a capability for quickly highlighting critical information from malware reports, such as the specific actions taken by a malware sample. Digital Operatives (DO) submitted to sub-tasks 1 and 2, using standard text analysis technology (text classification for sub-task 1, and a CRF for sub-task 2). Performance is broadly competitive with other submitted systems on sub-task 1 and weak on sub-task 2. The annotation guidelines for the intermediate sub-tasks create a linkage to the final task, which is both an annotation challenge and a potentially useful feature of the task. The methods that DO chose do not attempt to make use of this linkage, which may be a missed opportunity. This motivates a post-hoc error analysis. It appears that the annotation task is very hard, and that in some cases both deep conceptual knowledge and substantial surrounding context are needed in order to correctly classify sentences.

SJTU-NLP at SemEval-2018 Task 9: Neural Hypernym Discovery with Term Embeddings
Zhuosheng Zhang | Jiangtong Li | Hai Zhao | Bingjie Tang

This paper describes a hypernym discovery system for our participation in SemEval-2018 Task 9, which aims to discover the best (set of) candidate hypernyms for input concepts or entities, given the search space of a pre-defined vocabulary. We introduce a neural network architecture for the task and empirically study various neural network models to build the representations in latent space for words and phrases. The evaluated models include a convolutional neural network, a long short-term memory network, a gated recurrent unit and a recurrent convolutional neural network. We also explore different embedding methods, including word embedding and sense embedding, for better performance.

NLP_HZ at SemEval-2018 Task 9: a Nearest Neighbor Approach
Wei Qiu | Mosha Chen | Linlin Li | Luo Si

Hypernym discovery aims to discover the hypernym word sets given a hyponym word and a proper corpus. This paper proposes a simple but effective method for the discovery of hypernym sets based on word embedding, which can be used to measure the contextual similarities between words. Given a test hyponym word, we get its hypernym list by computing the similarities between the hyponym word and words in the training data, and fill the test word’s hypernym list with the hypernym list of the training word nearest to the test word. In SemEval 2018 Task 9, our results rank 1st on Spanish, 2nd on Italian, and 6th on English in terms of MAP.
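
The nearest-neighbor idea in this abstract can be sketched in a few lines: find the training word whose embedding is most similar to the test word and copy its hypernym list. This is a minimal sketch, not the authors' system; cosine similarity is assumed as the similarity measure, and the toy embeddings, toy training data, and function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_hypernyms(test_word, embeddings, training_hypernyms):
    """Copy the hypernym list of the training word nearest to test_word."""
    test_vector = embeddings[test_word]
    nearest = max(
        training_hypernyms,
        key=lambda word: cosine(test_vector, embeddings[word]),
    )
    return nearest, list(training_hypernyms[nearest])


# Toy three-dimensional embeddings and training pairs (assumed values).
toy_embeddings = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.1],
    "piano": [0.0, 0.1, 0.9],
    "guitar": [0.1, 0.0, 0.8],
}
toy_training = {
    "dog": ["canine", "mammal", "animal"],
    "piano": ["keyboard instrument", "musical instrument"],
}
neighbor, hypernyms = predict_hypernyms("cat", toy_embeddings, toy_training)
```

With real embeddings the search would run over the whole training vocabulary, where a vectorized similarity computation is preferable to this loop.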

UMDuluth-CS8761 at SemEval-2018 Task9: Hypernym Discovery using Hearst Patterns, Co-occurrence frequencies and Word Embeddings
Arshia Zernab Hassan | Manikya Swathi Vallabhajosyula | Ted Pedersen

Hypernym Discovery is the task of identifying potential hypernyms for a given term. A hypernym is a more generalized word that is super-ordinate to more specific words. This paper explores several approaches that rely on co-occurrence frequencies of word pairs, Hearst Patterns based on regular expressions, and word embeddings created from the UMBC corpus. Our system Babbage participated in Subtask 1A for English and placed 6th of 19 systems when identifying concept hypernyms, and 12th of 18 systems for entity hypernyms.
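
To make the regular-expression side of this abstract concrete, here is a minimal sketch of one classic Hearst pattern, "X such as Y, Z and W". It is not the Babbage system: the pattern is restricted to single-word terms, and the names `SUCH_AS`, `SEPARATOR`, and `extract_hypernym_pairs` are hypothetical.

```python
import re

# "<hypernym> such as <hyponym>(, <hyponym>)* (and|or) <hyponym>"
# Simplification: every term is a single word.
SUCH_AS = re.compile(
    r"(\w+),? such as (\w+(?:(?:, and |, or |, | and | or )\w+)*)"
)
SEPARATOR = re.compile(r", and |, or |, | and | or ")


def extract_hypernym_pairs(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1)
        for hyponym in SEPARATOR.split(match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs


toy_sentence = "She plays instruments such as guitars, pianos and violins."
pairs = extract_hypernym_pairs(toy_sentence)
```

Because the pattern is greedy and works on raw words, it can over-extend across a following "and" clause; a fuller version would match noun phrases from a chunker or tagger rather than bare `\w+` tokens.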

EXPR at SemEval-2018 Task 9: A Combined Approach for Hypernym Discovery
Ahmad Issa Alaa Aldine | Mounira Harzallah | Giuseppe Berio | Nicolas Béchet | Ahmad Faour

In this paper, we present our proposed system (EXPR) for the hypernym discovery task of SemEval 2018. The task addresses the challenge of discovering hypernym relations from a text corpus. Our proposal combines a path-based technique with a distributional technique. We use a dependency parser on a corpus to extract candidate hypernyms and represent their dependency paths as a feature vector. This feature vector is concatenated with a feature vector obtained from a term embedding model pre-trained on Wikipedia. The concatenated feature vector is fed to a supervised machine learning method to learn a classifier model. This model is able to classify new candidate hypernyms as hypernym or not. Our system performs well at discovering new hypernyms not defined in the gold hypernyms.

Meaning_space at SemEval-2018 Task 10: Combining explicitly encoded knowledge with information extracted from word embeddings
Pia Sommerauer | Antske Fokkens | Piek Vossen

This paper presents the two systems submitted by the meaning space team in Task 10 of the SemEval competition 2018 entitled Capturing discriminative attributes. The systems consist of combinations of approaches exploiting explicitly encoded knowledge about concepts in WordNet and information encoded in distributional semantic vectors. Rather than aiming for high performance, we explore which kind of semantic knowledge is best captured by different methods. The results indicate that WordNet glosses on different levels of the hierarchy capture many attributes relevant for this task. In combination with exploiting word embedding similarities, this source of information yielded our best results. Our best performing system ranked 5th out of 13 final ranks. Our analysis yields insights into the different kinds of attributes represented by different sources of knowledge.

CitiusNLP at SemEval-2018 Task 10: The Use of Transparent Distributional Models and Salient Contexts to Discriminate Word Attributes
Pablo Gamallo

This article describes the unsupervised strategy submitted by the CitiusNLP team to SemEval 2018 Task 10, a task which consists of predicting whether a word is a discriminative attribute between two other words. Our strategy relies on the correspondence between discriminative attributes and relevant contexts of a word. More precisely, the method uses transparent distributional models to extract salient contexts of words which are identified as discriminative attributes. The system reaches about 70 % accuracy on the development dataset, but its accuracy drops to 63 % on the official test dataset.

THU_NGN at SemEval-2018 Task 10: Capturing Discriminative Attributes with MLP-CNN model
Chuhan Wu | Fangzhao Wu | Sixing Wu | Zhigang Yuan | Yongfeng Huang

Existing semantic models are capable of identifying the semantic similarity of words. However, it’s hard for these models to discriminate between a word and another similar word. Thus, the aim of SemEval-2018 Task 10 is to predict whether a word is a discriminative attribute between two concepts. In this task, we apply a multilayer perceptron (MLP)-convolutional neural network (CNN) model to identify whether an attribute is discriminative. The CNNs are used to extract low-level features from the inputs. The MLP takes both the flattened CNN maps and the inputs to predict the labels. The evaluation F-score of our system on the test set is 0.629 (ranked 15th), which indicates that our system still needs to be improved. However, the behaviours of our system in our experiments provide useful information, which can help to improve the collective understanding of this novel task.

ALB at SemEval-2018 Task 10: A System for Capturing Discriminative Attributes
Bogdan Dumitru | Alina Maria Ciobanu | Liviu P. Dinu

Semantic difference detection attempts to capture whether a word is a discriminative attribute between two other words. For example, the discriminative feature red characterizes the first word from the (apple, banana) pair, but not the second. Modeling semantic difference is essential for language understanding systems, as it provides useful information for identifying particular aspects of word senses. This paper describes our system implementation (the ALB system of the NLP@Unibuc team) for the 10th task of the SemEval 2018 workshop, Capturing Discriminative Attributes. We propose a method for semantic difference detection that uses an SVM classifier with features based on co-occurrence counts and shallow semantic parsing, achieving 0.63 F1 score in the competition.

ELiRF-UPV at SemEval-2018 Task 10: Capturing Discriminative Attributes with Knowledge Graphs and Wikipedia
José-Ángel González | Lluís-F. Hurtado | Encarna Segarra | Ferran Pla

This paper describes the participation of the ELiRF-UPV team in Task 10, Capturing Discriminative Attributes, of SemEval-2018. Our best approach consists of using ConceptNet, Wikipedia and NumberBatch embeddings in order to establish relationships between concepts and attributes. Furthermore, this system achieves competitive results in the official evaluation.

Wolves at SemEval-2018 Task 10: Semantic Discrimination based on Knowledge and Association
Shiva Taslimipoor | Omid Rohanian | Le An Ha | Gloria Corpas Pastor | Ruslan Mitkov

This paper describes the system submitted to SemEval 2018 shared task 10 ‘Capturing Discriminative Attributes’. We use a combination of knowledge-based and co-occurrence features to capture the semantic difference between two words in relation to an attribute. We define scores based on association measures, ngram counts, word similarity, and ConceptNet relations. The system is ranked 4th (joint) on the official leaderboard of the task.

Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge
Robyn Speer | Joanna Lowry-Duda

Luminoso participated in the SemEval 2018 task on Capturing Discriminative Attributes with a system based on ConceptNet, an open knowledge graph focused on general knowledge. In this paper, we describe how we trained a linear classifier on a small number of semantically-informed features to achieve an F1 score of 0.7368 on the task, close to the task’s high score of 0.75.

BomJi at SemEval-2018 Task 10: Combining Vector-, Pattern- and Graph-based Information to Identify Discriminative Attributes
Enrico Santus | Chris Biemann | Emmanuele Chersoni

This paper describes BomJi, a supervised system for capturing discriminative attributes in word pairs (e.g. yellow as discriminative for banana over watermelon). The system relies on an XGB classifier trained on carefully engineered graph-, pattern- and word embedding-based features. It participated in the SemEval-2018 Task 10 on Capturing Discriminative Attributes, achieving an F1 score of 0.73 and ranking 2nd out of 26 participant systems.

Igevorse at SemEval-2018 Task 10: Exploring an Impact of Word Embeddings Concatenation for Capturing Discriminative Attributes
Maxim Grishin

This paper compares several approaches for capturing discriminative attributes and examines how concatenating several word embeddings of different kinds affects classification performance. A similarity-based method is proposed and compared with classical machine learning approaches. This method is shown to outperform the others on all the considered word vector models, and performance increases when concatenated datasets are used.

AmritaNLP at SemEval-2018 Task 10: Capturing discriminative attributes using convolution neural network over global vector representation.
Vivek Vinayan | Anand Kumar M | Soman K P

The Capturing Discriminative Attributes shared task is the tenth task of SemEval-2018. The task is to predict whether a word captures a distinguishing attribute of one word with respect to another. We use GloVe word embeddings, pre-trained on an openly sourced corpus, for this task. A base representation is initially established over varied dimensions. These representations are evaluated based on validation scores over two models: first an SVM-based classifier and second a one-dimensional CNN model. The scores are used to further develop the representation with vector combinations, by considering various distance measures. These measures correspond to offset vectors, which are concatenated as features, mainly to improve the F1 score, with the best accuracy. The features are then further tuned on the validation scores to achieve the highest F1 score. Our evaluation narrowed down to two representations, classified with CNN models, with total dimension lengths of 1204 and 1203 for the final submissions. Of the two, the latter feature representation delivered our best F1 score of 0.658024 (as per result).

Discriminator at SemEval-2018 Task 10: Minimally Supervised Discrimination
Artur Kulmizev | Mostafa Abdou | Vinit Ravishankar | Malvina Nissim

We participated in the SemEval-2018 shared task on capturing discriminative attributes (Task 10) with a simple system that ranked 8th amongst the 26 teams that took part in the evaluation. Our final score was 0.67, which is competitive with the winning score of 0.75, particularly given that our system is a zero-shot system that requires no training and minimal parameter optimisation. In addition to describing the submitted system, and discussing the implications of the relative success of such a system on this task, we also report on other, more complex models we experimented with.

UMD at SemEval-2018 Task 10: Can Word Embeddings Capture Discriminative Attributes?
Alexander Zhang | Marine Carpuat

We describe the University of Maryland’s submission to SemEval-2018 Task 10, Capturing Discriminative Attributes: given word triples (w1, w2, d), the goal is to determine whether d is a discriminating attribute belonging to w1 but not w2. Our study aims to determine whether word embeddings can address this challenging task. Our submission casts this problem as supervised binary classification using only word embedding features. Using a Gaussian SVM model trained only on validation data results in an F-score of 60%. We also show that cosine similarity features are more effective, both in unsupervised systems (F-score of 65%) and supervised systems (F-score of 67%).
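
As a rough illustration of how cosine similarity can be used without supervision on a triple (w1, w2, d), the sketch below flags d as discriminative when it is noticeably closer to w1 than to w2. The toy vectors, the `margin` value, and the function names are assumptions made for this example; the abstract does not specify the decision rule the authors used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def is_discriminative(w1_vec, w2_vec, d_vec, margin=0.1):
    """Hypothetical rule: d discriminates w1 from w2 if it is closer to w1 by at least `margin`."""
    return cosine(w1_vec, d_vec) - cosine(w2_vec, d_vec) > margin

# Toy 3-dimensional "embeddings" (made up for illustration, not real word vectors).
apple = [0.9, 0.1, 0.3]
banana = [0.1, 0.9, 0.3]
red = [1.0, 0.0, 0.2]

print(is_discriminative(apple, banana, red))
print(is_discriminative(banana, apple, red))
```

In a supervised variant, the two cosine values (or their difference) would be fed to a classifier as features instead of being compared against a fixed margin.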

NTU NLP Lab System at SemEval-2018 Task 10: Verifying Semantic Differences by Integrating Distributional Information and Expert Knowledge
Yow-Ting Shiue | Hen-Hsen Huang | Hsin-Hsi Chen

This paper presents the NTU NLP Lab system for the SemEval-2018 Capturing Discriminative Attributes task. Word embeddings, pointwise mutual information (PMI), ConceptNet edges, and shortest path lengths are used as input features to build binary classifiers that tell whether an attribute is discriminative for a pair of concepts. Our neural network model reaches an F1 score of about 73% on the test set and ranks 3rd in the task. Though the attributes in this task are all visual, our models are not provided with any image data. The results indicate that visual information can be derived from textual data.
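
For readers unfamiliar with PMI, the following minimal sketch computes it from raw co-occurrence counts using the standard definition, PMI(x, y) = log2(P(x, y) / (P(x) P(y))). The counts are invented for illustration, and how the authors turned PMI values into classifier features is not stated in the abstract.

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information (base 2) from raw counts.

    count_xy: number of contexts containing both x and y
    count_x, count_y: number of contexts containing x (resp. y)
    total: total number of contexts
    Returns -inf when the pair never co-occurs.
    """
    if count_xy == 0:
        return float("-inf")
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log2(p_xy / (p_x * p_y))

# Hypothetical counts: a pair that co-occurs ten times more often than chance.
print(pmi(count_xy=10, count_x=100, count_y=100, total=10000))
```

A positive value means the two words co-occur more often than independence would predict; a value near zero means they look independent.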

ELiRF-UPV at SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge
José-Ángel González | Lluís-F. Hurtado | Encarna Segarra | Ferran Pla

This paper describes the participation of the ELiRF-UPV team in Task 11, Machine Comprehension using Commonsense Knowledge, of SemEval-2018. Our approach uses word embeddings, NumberBatch embeddings, and a deep learning architecture to find the best answer to multiple-choice questions based on the narrative text. The results obtained are in line with those of the other participants and encourage us to continue working on this problem.

YNU_Deep at SemEval-2018 Task 11: An Ensemble of Attention-based BiLSTM Models for Machine Comprehension
Peng Ding | Xiaobing Zhou

We first use GloVe to learn distributed representations automatically from the instance, question, and answer triples. An attention-based Bidirectional LSTM (BiLSTM) model then encodes the triples. We also apply a simple ensemble method to improve the effectiveness of our model. The system obtains an encouraging result on this task: it achieves an accuracy of 0.7472 on the test set and ranks 5th in the official ranking.

CSReader at SemEval-2018 Task 11: Multiple Choice Question Answering as Textual Entailment
Zhengping Jiang | Qi Sun

In this document we present an end-to-end machine reading comprehension system that solves multiple choice questions from a textual entailment perspective. Since some of the knowledge required is not explicitly mentioned in the text, we try to exploit commonsense knowledge by using pretrained word embeddings during contextual embedding and by dynamically generating a weighted representation of related script knowledge. The model ensembles two kinds of prediction structure, and the final accuracy of our system is 10 percent higher than the naive baseline.

YNU-HPCC at Semeval-2018 Task 11: Using an Attention-based CNN-LSTM for Machine Comprehension using Commonsense Knowledge
Hang Yuan | Jin Wang | Xuejie Zhang

This shared task is a typical question answering task. Unlike a general question answering system, it must answer each question based on the text provided, so the problem is essentially one of reading comprehension. Typically, each text has several corresponding questions, and each question has two candidate answers, only one of which is correct. The usual approach to this problem is to use convolutional neural networks (CNN) and recurrent neural networks (RNN) or their improved variants, such as long short-term memory (LSTM). In this paper, an attention-based CNN-LSTM model is proposed for this task. Adding an attention mechanism and combining the two models significantly improves the experimental results.

Jiangnan at SemEval-2018 Task 11: Deep Neural Network with Attention Method for Machine Comprehension Task
Jiangnan Xia

This paper describes our submission for the International Workshop on Semantic Evaluation (SemEval-2018) shared task 11, Machine Comprehension using Commonsense Knowledge (Ostermann et al., 2018b). We use a deep neural network model to choose the correct answer from a pair of candidate answers, given the document and question. The interactions between document, question, and answers are modeled by an attention mechanism, and a variety of manual features are used to improve model performance. We also use CoVe (McCann et al., 2017) as an external source of knowledge that is not mentioned in the document. As a result, our system achieves 80.91% accuracy on the test data, placing third on the leaderboard.

Lyb3b at SemEval-2018 Task 11: Machine Comprehension Task using Deep Learning Models
Yongbin Li | Xiaobing Zhou

Machine comprehension of text is a typical natural language processing task that remains an elusive challenge. This paper addresses Task 11 of SemEval-2018, Machine Comprehension using Commonsense Knowledge. We use a deep learning model to solve the problem. We build distributed word embeddings of the text, question, and answers, respectively, instead of manually extracting features with linguistic tools. We then process the word vectors with a series of frameworks: a CNN model, an LSTM model, an LSTM with attention model, and a biLSTM with attention model. Experiments demonstrate the superior performance of the biLSTM with attention framework compared to the other models. We also delete high-frequency words and combine word vectors with data augmentation methods, which has some effect. Our approach ranks 6th in the official results, with an accuracy of 0.7437 on the test dataset.

MITRE at SemEval-2018 Task 11: Commonsense Reasoning without Commonsense Knowledge
Elizabeth Merkhofer | John Henderson | David Bloom | Laura Strickhart | Guido Zarrella

This paper describes MITRE’s participation in SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge. The techniques explored range from simple bag-of-ngrams classifiers to neural architectures with varied attention and alignment mechanisms. Logistic regression ties the systems together into an ensemble submitted for evaluation. The resulting system answers reading comprehension questions with 82.27% accuracy.

SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension
Taeuk Kim | Jihun Choi | Sang-goo Lee

We present a novel neural architecture for the Argument Reasoning Comprehension task of SemEval 2018. It is a simple neural network consisting of three parts, collectively judging whether the logic built on a set of given sentences (a claim, reason, and warrant) is plausible or not. The model utilizes contextualized word vectors pre-trained on large machine translation (MT) datasets as a form of transfer learning, which can help to mitigate the lack of training data. Quantitative analysis shows that simply leveraging LSTMs trained on MT datasets outperforms several baselines and non-transferred models, achieving accuracies of about 70% on the development set and about 60% on the test set.

ITNLP-ARC at SemEval-2018 Task 12: Argument Reasoning Comprehension with Attention
Wenjie Liu | Chengjie Sun | Lei Lin | Bingquan Liu

Reasoning is an important topic with many applications in natural language processing. Semantic Evaluation (SemEval) 2018 Task 12, the Argument Reasoning Comprehension Task, is dedicated to research on natural language reasoning. For this task, we propose a novel argument reasoning comprehension system, ITNLP-ARC, which uses neural networks to solve this problem. In our system, an LSTM model encodes both the premise sentences and the warrant sentences. An attention model is used to merge the two premise sentence vectors. By comparing the similarity between the attention vector and each of the two warrant vectors, we choose the warrant with the higher similarity as our system’s final answer.
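
The selection step described above (merge the premise vectors, then pick the more similar warrant) can be sketched as follows. This is only an illustration under stated assumptions: the sentence vectors are hand-written stand-ins for LSTM encodings, the attention scores are supplied by the caller rather than learned, and cosine similarity is assumed as the similarity measure; none of these details are given in the abstract.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def merge_with_attention(vectors, scores):
    """Weighted sum of vectors, with weights given by softmax(scores)."""
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def choose_warrant(premise_vecs, premise_scores, warrant_vecs):
    """Return the index of the warrant most similar to the merged premise vector."""
    merged = merge_with_attention(premise_vecs, premise_scores)
    sims = [cosine(merged, w) for w in warrant_vecs]
    return max(range(len(sims)), key=sims.__getitem__)

# Hypothetical 2-d encodings of the two premise sentences and the two candidate warrants.
premises = [[1.0, 0.0], [0.8, 0.2]]
warrants = [[1.0, 0.1], [0.0, 1.0]]
print(choose_warrant(premises, [0.0, 0.0], warrants))
```

With equal attention scores the merged vector is the plain average of the premise vectors; a trained model would instead produce the scores from the sentences themselves.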

ECNU at SemEval-2018 Task 12: An End-to-End Attention-based Neural Network for the Argument Reasoning Comprehension Task
Junfeng Tian | Man Lan | Yuanbin Wu

This paper presents our submissions to SemEval 2018 Task 12: the Argument Reasoning Comprehension Task. We investigate an end-to-end attention-based neural network to represent the two lexically close candidate warrants. On the one hand, we extract the parts in which they differ as attention vectors to obtain distinguishable representations. On the other hand, we use their surroundings (i.e., claim, reason, debate context) as additional attention vectors to obtain contextual representations, which serve as final clues for selecting the correct warrant. Our model achieves 60.4% accuracy and ranks 3rd among 22 participating systems.

NLITrans at SemEval-2018 Task 12: Transfer of Semantic Knowledge for Argument Comprehension
Timothy Niven | Hung-Yu Kao

The Argument Reasoning Comprehension Task is a difficult challenge requiring significant language understanding and complex reasoning over world knowledge. We focus on transfer of a sentence encoder to bootstrap more complicated architectures given the small size of the dataset. Our best model uses a pre-trained BiLSTM to encode input sentences, learns task-specific features for the argument and warrants, then performs independent argument-warrant matching. This model achieves mean test set accuracy of 61.31%. Encoder transfer yields a significant gain to our best model over random initialization. Sharing parameters for independent warrant evaluation provides regularization and effectively doubles the size of the dataset. We demonstrate that regularization comes from ignoring statistical correlations between warrant positions. We also report an experiment with our best model that only matches warrants to reasons, ignoring claims. Performance is still competitive, suggesting that our model is not necessarily learning the intended task.

BLCU_NLP at SemEval-2018 Task 12: An Ensemble Model for Argument Reasoning Based on Hierarchical Attention
Meiqian Zhao | Chunhua Liu | Lu Liu | Yan Zhao | Dong Yu

To comprehend an argument and fill the gap between claims and reasons, it is vital to find the implicit supporting warrants behind them. In this paper, we propose a hierarchical attention model to identify the right warrant, which explains why the reason supports the claim. Our model focuses not only on the similar parts between warrants and other information but also on the contradictory parts between the two opposing warrants. In addition, we use an ensemble method over different models. Our model achieves an accuracy of 61%, ranking second in this task. Experimental results demonstrate that our model is effective at making correct choices.

YNU Deep at SemEval-2018 Task 12: A BiLSTM Model with Neural Attention for Argument Reasoning Comprehension
Peng Ding | Xiaobing Zhou

This paper describes the system submitted to SemEval-2018 Task 12 (The Argument Reasoning Comprehension Task). Enabling a computer to understand a text so that it can answer comprehension questions is still a challenging goal of NLP. We propose a Bidirectional LSTM (BiLSTM) model that reads two sentences separated by a delimiter to determine which warrant is correct. We extend this model with a neural attention mechanism that encourages the model to reason over the given claims and reasons. Officially released results show that our system ranks 6th among 22 submissions to this task.

UniMelb at SemEval-2018 Task 12: Generative Implication using LSTMs, Siamese Networks and Semantic Representations with Synonym Fuzzing
Anirudh Joshi | Tim Baldwin | Richard O. Sinnott | Cecile Paris

This paper describes a warrant classification system for SemEval 2018 Task 12 that attempts to learn semantic representations of reasons, claims, and warrants. The system consists of 3 stacked LSTMs: one for the reason, one for the claim, and one shared Siamese Network for the 2 candidate warrants. Our main contribution is to force the embeddings into a shared feature space using vector operations, semantic similarity classification, Siamese networks, and multi-task learning. In doing so, we learn a form of generative implication, in encoding implication interrelationships between reasons, claims, and the associated correct and incorrect warrants. We augment the limited data in the task further by utilizing WordNet synonym fuzzing. When applied to SemEval 2018 Task 12, our system performs well on the development data, and officially ranked 8th among 21 teams.

Joker at SemEval-2018 Task 12: The Argument Reasoning Comprehension with Neural Attention
Guobin Sui | Wenhan Chao | Zhunchen Luo

This paper describes a classification system that participated in SemEval-2018 Task 12: The Argument Reasoning Comprehension Task. Briefly, the task provides a natural language argument with a reason, a claim, and a correct and an incorrect warrant, and the system must choose the correct warrant. To fully capture the semantic information of the sentences, we propose a neural network architecture with an attention mechanism. We also try introducing keywords into the model to improve accuracy. The proposed system achieved 5th place among 22 participating systems.

TRANSRW at SemEval-2018 Task 12: Transforming Semantic Representations for Argument Reasoning Comprehension
Zhimin Chen | Wei Song | Lizhen Liu

This paper describes our system for SemEval-2018 Task 12: Argument Reasoning Comprehension. The task is to select the correct warrant that explains the reasoning of a particular argument consisting of a claim and a reason. Our methods rest on the assumption that the semantic composition of the reason and the warrant should be close to the semantic representation of the corresponding claim. We propose two neural network models. The first one considers the two warrant candidates simultaneously, while the second one processes each candidate separately and then chooses the best one. We also incorporate sentiment polarity by assuming that there are certain kinds of sentiment associations between the reason, the warrant, and the claim. The experiments show that the first framework is more effective and that sentiment polarity is useful.

Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics

Malvina Nissim | Jonathan Berant | Alessandro Lenci

Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization
Kian Kenyon-Dean | Jackie Chi Kit Cheung | Doina Precup

We present an approach to event coreference resolution by developing a general framework for clustering that uses supervised representation learning. We propose a neural network architecture with novel Clustering-Oriented Regularization (CORE) terms in the objective function. These terms encourage the model to create embeddings of event mentions that are amenable to clustering. We then use agglomerative clustering on these embeddings to build event coreference chains. For both within- and cross-document coreference on the ECB+ corpus, our model obtains better results than models that require significantly more pre-annotated information. This work provides insight and motivating results for a new general approach to solving coreference and clustering problems with representation learning.

Learning distributed event representations with a multi-task approach
Xudong Hong | Asad Sayeed | Vera Demberg

Human world knowledge contains information about prototypical events and their participants and locations. In this paper, we train the first models using multi-task learning that can both predict missing event participants and also perform semantic role classification based on semantic plausibility. Our best-performing model is an improvement over the previous state-of-the-art on thematic fit modelling tasks. The event embeddings learned by the model can additionally be used effectively in an event similarity task, also outperforming the state-of-the-art.

Assessing Meaning Components in German Complex Verbs: A Collection of Source-Target Domains and Directionality
Sabine Schulte im Walde | Maximilian Köper | Sylvia Springorum

This paper presents a collection to assess meaning components in German complex verbs, which frequently undergo meaning shifts. We use a novel strategy to obtain source and target domain characterisations via sentence generation rather than sentence annotation. A selection of arrows adds spatial directional information to the generated contexts. We provide a broad qualitative description of the dataset, and a series of standard classification experiments verifies the quantitative reliability of the presented resource. The setup for collecting the meaning components is applicable also to other languages, regarding complex verbs as well as other language-specific targets that involve meaning shifts.

Mixing Context Granularities for Improved Entity Linking on Question Answering Data across Entity Categories
Daniil Sorokin | Iryna Gurevych

The first stage of every knowledge base question answering approach is to link entities in the input question. We investigate entity linking in the context of the question answering task and present a jointly optimized neural architecture for entity mention detection and entity disambiguation that models the surrounding context on different levels of granularity. We use the Wikidata knowledge base and available question answering datasets to create benchmarks for entity linking on question answering data. Our approach outperforms the previous state-of-the-art system on this data, resulting in an average 8% improvement of the final score. We further demonstrate that our model delivers a strong performance across different entity categories.

Quantitative Semantic Variation in the Contexts of Concrete and Abstract Words
Daniela Naumann | Diego Frassinelli | Sabine Schulte im Walde

Across disciplines, researchers are eager to gain insight into empirical features of abstract vs. concrete concepts. In this work, we provide a detailed characterisation of the distributional nature of abstract and concrete words across 16,620 English nouns, verbs and adjectives. Specifically, we investigate the following questions: (1) What is the distribution of concreteness in the contexts of concrete and abstract target words? (2) What are the differences between concrete and abstract words in terms of contextual semantic diversity? (3) How does the entropy of concrete and abstract word contexts differ? Overall, our studies show consistent differences in the distributional representation of concrete and abstract words, thus challenging existing theories of cognition and providing a more fine-grained description of their nature.
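
To make question (3) concrete, the sketch below computes the Shannon entropy of a target word's context distribution from a list of observed context words. The context lists are invented, and the abstract does not say how the authors define contexts or estimate entropy, so this is only one plausible reading.

```python
import math
from collections import Counter

def context_entropy(context_words):
    """Shannon entropy (in bits) of the empirical distribution over context words."""
    counts = Counter(context_words)
    total = sum(counts.values())
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical context words observed around two target words.
narrow_contexts = ["eat", "eat", "eat", "eat"]
varied_contexts = ["idea", "right", "law", "value"]

print(context_entropy(narrow_contexts))
print(context_entropy(varied_contexts))
```

Higher entropy means the target word appears in a more evenly spread, less predictable set of contexts.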

The Limitations of Cross-language Word Embeddings Evaluation
Amir Bakarov | Roman Suvorov | Ilya Sochenkov

The aim of this work is to explore the possible limitations of existing methods of cross-language word embeddings evaluation, addressing the lack of correlation between intrinsic and extrinsic cross-language evaluation methods. To prove this hypothesis, we construct English-Russian datasets for extrinsic and intrinsic evaluation tasks and compare the performance of 5 different cross-language models on them. The results show that the scores, even on different intrinsic benchmarks, do not correlate with each other. We conclude that using human references as ground truth for cross-language word embeddings is not appropriate unless one understands how native speakers process semantics in their cognition.

How Gender and Skin Tone Modifiers Affect Emoji Semantics in Twitter
Francesco Barbieri | Jose Camacho-Collados

In this paper we analyze the use of emojis in social media with respect to gender and skin tone. On a dataset of over twenty-two million tweets from the United States, a simple frequency-based analysis clearly highlights some findings. Moreover, we carry out a semantic analysis of the usage of emojis and their modifiers (e.g. gender and skin tone) by embedding all words, emojis and modifiers into the same vector space. Our analyses reveal that some stereotypes related to skin color and gender seem to be reflected in the use of these modifiers. For example, emojis representing hand gestures are more widely utilized with lighter skin tones, and the usage across skin tones differs significantly. At the same time, the vector corresponding to the male modifier tends to be semantically close to emojis related to business or technology, whereas their female counterparts appear closer to emojis about love or makeup.

Learning Patient Representations from Text
Dmitriy Dligach | Timothy Miller

Mining electronic health records for patients who satisfy a set of predefined criteria is known in medical informatics as phenotyping. Phenotyping has numerous applications such as outcome prediction, clinical trial recruitment, and retrospective studies. Supervised machine learning for phenotyping typically relies on sparse patient representations such as bag-of-words. We consider an alternative that involves learning patient representations. We develop a neural network model for learning patient representations and show that the learned representations are general enough to obtain state-of-the-art performance on a standard comorbidity detection task.

Polarity Computations in Flexible Categorial Grammar
Hai Hu | Larry Moss

This paper shows how to take parse trees in CCG and algorithmically find the polarities of all the constituents. Our work uses the well-known polarization principle corresponding to function application, and we have extended this with principles for type raising and composition. We provide an algorithm, extending the polarity marking algorithm of van Benthem. We discuss how our system works in practice, taking input from the C&C parser.

Halo: Learning Semantics-Aware Representations for Cross-Lingual Information Extraction
Hongyuan Mei | Sheng Zhang | Kevin Duh | Benjamin Van Durme

Cross-lingual information extraction (CLIE) is an important and challenging task, especially in low resource scenarios. To tackle this challenge, we propose a training method, called Halo, which enforces the local region of each hidden state of a neural model to only generate target tokens with the same semantic structure tag. This simple but powerful technique enables a neural model to learn semantics-aware representations that are robust to noise, without introducing any extra parameter, thus yielding better generalization in both high and low resource settings.

Predicting Word Embeddings Variability
Bénédicte Pierrejean | Ludovic Tanguy

Neural word embedding models (such as those built with word2vec) are known to have stability problems: when retraining a model with the exact same hyperparameters, word neighborhoods may change. We propose a method to estimate such variation, based on the overlap of neighbors of a given word in two models trained with identical hyperparameters. We show that this inherent variation is not negligible, and that it does not affect every word in the same way. We examine the influence of several features that are intrinsic to a word, corpus or embedding model and provide a methodology that can predict the variability (and as such, reliability) of a word representation in a semantic vector space.
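
One simple way to turn "overlap of neighbors in two models" into a number is sketched below: take the k nearest neighbors of a word in each model and report one minus the fraction they share. The toy models, the choice of cosine similarity, and the exact formula are assumptions for illustration; the abstract does not give the authors' precise measure.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def nearest_neighbors(model, word, k):
    """Set of the k words most similar to `word` in `model` (a dict: word -> vector)."""
    target = model[word]
    others = [w for w in model if w != word]
    others.sort(key=lambda w: cosine(target, model[w]), reverse=True)
    return set(others[:k])

def neighbor_variation(model_a, model_b, word, k):
    """1 - (shared neighbors / k): 0.0 means identical neighborhoods, 1.0 means disjoint."""
    shared = nearest_neighbors(model_a, word, k) & nearest_neighbors(model_b, word, k)
    return 1.0 - len(shared) / k

# Two hypothetical models over the same vocabulary, standing in for two training runs.
run_a = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0], "bus": [0.1, 0.9]}
run_b = {"cat": [1.0, 0.0], "dog": [0.2, 0.8], "car": [0.9, 0.2], "bus": [0.0, 1.0]}

print(neighbor_variation(run_a, run_b, "cat", k=2))
```

Because the measure only compares neighbor sets, it does not require the two vector spaces to be aligned with each other.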

Integrating Multiplicative Features into Supervised Distributional Methods for Lexical Entailment
Tu Vu | Vered Shwartz

Supervised distributional methods are applied successfully in lexical entailment, but recent work questioned whether these methods actually learn a relation between two words. Specifically, Levy et al. (2015) claimed that linear classifiers learn only separate properties of each word. We suggest a cheap and easy way to boost the performance of these methods by integrating multiplicative features into commonly used representations. We provide an extensive evaluation with different classifiers and evaluation setups, and suggest a suitable evaluation setup for the task, eliminating biases existing in previous ones.
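
As an illustration of what adding multiplicative features to a common pair representation can look like, the sketch below appends the element-wise product of the two word vectors to their concatenation. Treating "multiplicative features" as an element-wise product and using concatenation as the base representation are assumptions made for this example; the abstract does not spell out the exact construction.

```python
def concat_features(x, y):
    """Base pair representation: the two word vectors side by side."""
    return list(x) + list(y)

def multiplicative_features(x, y):
    """Element-wise product, which depends on both words jointly."""
    return [a * b for a, b in zip(x, y)]

def combined_features(x, y):
    """Concatenation extended with the multiplicative features."""
    return concat_features(x, y) + multiplicative_features(x, y)

# Hypothetical 2-d vectors for a candidate (hyponym, hypernym) pair.
x = [1.0, 2.0]
y = [3.0, 4.0]
print(combined_features(x, y))
```

With concatenation alone, a linear classifier scores each word separately; the product terms give it inputs that change only when the two words interact.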

Deep Affix Features Improve Neural Named Entity Recognizers
Vikas Yadav | Rebecca Sharp | Steven Bethard

We propose a practical model for named entity recognition (NER) that combines word and character-level information with a specific learned representation of the prefixes and suffixes of the word. We apply this approach to multilingual and multi-domain NER and show that it achieves state of the art results on the CoNLL 2002 Spanish and Dutch and CoNLL 2003 German NER datasets, consistently achieving 1.5-2.3 percent over the state of the art without relying on any dictionary features. Additionally, we show improvement on SemEval 2013 task 9.1 DrugNER, achieving state of the art results on the MedLine dataset and the second best results overall (-1.3% from state of the art). We also establish a new benchmark on the I2B2 2010 Clinical NER dataset with 84.70 F-score.

pdf bib
Hypothesis Only Baselines in Natural Language Inference
Adam Poliak | Jason Naradowsky | Aparajita Haldar | Rachel Rudinger | Benjamin Van Durme

We propose a hypothesis only baseline for diagnosing Natural Language Inference (NLI). Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context and a hypothesis, it follows that assessing entailment relations while ignoring the provided context is a degenerate solution. Yet, through experiments on 10 distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, is able to significantly outperform a majority-class baseline across a number of NLI datasets. Our analysis suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.
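The comparison described above can be sketched in a few lines. The classifier here is a deliberately simple stand-in (per-token label counts), not the model used in the paper; the point is only that it never reads the context, yet can beat the majority-class baseline when hypothesis wording correlates with the label. All names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def majority_label(labels):
    """Most frequent label in the training set."""
    return Counter(labels).most_common(1)[0][0]


def train_hypothesis_only(examples):
    """Count label occurrences per hypothesis token.

    Each example is (context, hypothesis, label); the context is ignored.
    """
    counts = defaultdict(Counter)
    for _context, hypothesis, label in examples:
        for token in hypothesis.lower().split():
            counts[token][label] += 1
    return counts


def predict_hypothesis_only(counts, hypothesis, fallback):
    """Predict the label with the most token votes, else the fallback."""
    votes = Counter()
    for token in hypothesis.lower().split():
        votes.update(counts.get(token, {}))
    return votes.most_common(1)[0][0] if votes else fallback


def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Toy data in which negation words in the hypothesis leak the label.
train = [
    ("ctx1", "nobody sleeps", "contradiction"),
    ("ctx2", "nobody eats", "contradiction"),
    ("ctx3", "a person walks", "entailment"),
    ("ctx4", "a person sits", "entailment"),
    ("ctx5", "a person stands", "entailment"),
]
test = [("ctx6", "nobody walks", "contradiction"), ("ctx7", "a person eats", "entailment")]

fallback = majority_label([label for _, _, label in train])
counts = train_hypothesis_only(train)
gold = [label for _, _, label in test]
hyp_only = [predict_hypothesis_only(counts, h, fallback) for _, h, _ in test]
print(accuracy([fallback] * len(test), gold), accuracy(hyp_only, gold))  # 0.5 1.0
```

A large gap between the two accuracies on a real dataset would suggest that the hypotheses alone carry label information.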

pdf bib
Term Definitions Help Hypernymy Detection
Wenpeng Yin | Dan Roth

Existing methods of hypernymy detection mainly rely on statistics over a big corpus, either mining some co-occurring patterns like "animals such as cats" or embedding words of interest into context-aware vectors. These approaches are therefore limited by the availability of a large enough corpus that can cover all terms of interest and provide sufficient contextual information to represent their meaning. In this work, we propose a new paradigm, HyperDef, for hypernymy detection, which expresses word meaning by encoding word definitions along with context-driven representation. This has two main benefits: (i) definitional sentences express (sense-specific) corpus-independent meanings of words, hence definition-driven approaches enable strong generalization: once trained, the model is expected to work well in open-domain testbeds; (ii) global context from a large corpus and definitions provide complementary information for words. Consequently, our model, HyperDef, once trained on task-agnostic data, gets state-of-the-art results in multiple benchmarks.

pdf bib
Agree or Disagree: Predicting Judgments on Nuanced Assertions
Michael Wojatzki | Torsten Zesch | Saif Mohammad | Svetlana Kiritchenko

Being able to predict whether people agree or disagree with an assertion (i.e. an explicit, self-contained statement) has several applications ranging from predicting how many people will like or dislike a social media post to classifying posts based on whether they are in accordance with a particular point of view. We formalize this as two NLP tasks: predicting judgments of (i) individuals and (ii) groups based on the text of the assertion and previous judgments. We evaluate a wide range of approaches on a crowdsourced data set containing over 100,000 judgments on over 2,000 assertions. We find that predicting individual judgments is a hard task with our best results only slightly exceeding a majority baseline, but that judgments of groups can be more reliably predicted using a Siamese neural network, which outperforms all other approaches by a wide margin.

pdf bib
Measuring Frame Instance Relatedness
Valerio Basile | Roque Lopez Condori | Elena Cabrio

Frame semantics is a well-established framework to represent the meaning of natural language in computational terms. In this work, we aim to propose a quantitative measure of relatedness between pairs of frame instances. We test our method on a dataset of sentence pairs, highlighting the correlation between our metric and human judgments of semantic similarity. Furthermore, we propose an application of our measure for clustering frame instances to extract prototypical knowledge from natural language.

pdf bib
Solving Feature Sparseness in Text Classification using Core-Periphery Decomposition
Xia Cui | Sadamori Kojaku | Naoki Masuda | Danushka Bollegala

Feature sparseness is a problem common to cross-domain and short-text classification tasks. To overcome this feature sparseness problem, we propose a novel method based on graph decomposition to find candidate features for expanding feature vectors. Specifically, we first create a feature-relatedness graph, which is subsequently decomposed into core-periphery (CP) pairs, and we use the peripheries as the expansion candidates of the cores. We expand both training and test instances using the computed related features and use them to train a text classifier. We observe that prioritising features that are common to both training and test instances as cores during the CP decomposition further improves the accuracy of text classification. We evaluate the proposed CP-decomposition-based feature expansion method on benchmark datasets for cross-domain sentiment classification and short-text classification. Our experimental results show that the proposed method consistently outperforms all baselines on short-text classification tasks, and performs competitively with pivot-based cross-domain sentiment classification methods.
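The expansion step alone can be sketched as follows. The sketch assumes the CP decomposition has already been computed and is available as a mapping from each core feature to its periphery features; how that mapping is obtained from the feature-relatedness graph is the paper's contribution and is not reproduced here. The function name and toy mapping are hypothetical.

```python
def expand_features(features, cp_pairs):
    """Append the peripheries of every core feature present in an instance.

    `features` is the instance's original feature list; `cp_pairs` is an
    assumed precomputed dict mapping a core feature to its peripheries.
    Original features keep their order and no periphery is added twice.
    """
    expanded = list(features)
    seen = set(features)
    for feature in features:
        for periphery in cp_pairs.get(feature, ()):
            if periphery not in seen:
                seen.add(periphery)
                expanded.append(periphery)
    return expanded


# Toy core -> peripheries mapping (illustrative only).
cp_pairs = {"good": ["great", "excellent"], "plot": ["story", "great"]}

print(expand_features(["good", "plot"], cp_pairs))
# ['good', 'plot', 'great', 'excellent', 'story']
```

Applying the same function to training and test instances gives a short text more features in common with the rest of the data, which is the intended remedy for sparseness.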