Hiroya Takamura


2021

pdf bib
Making Your Tweets More Fancy : Emoji Insertion to Texts
Jingun Kwon | Naoki Kobayashi | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

In the social media, users frequently use small images called emojis in their posts. Although using emojis in texts plays a key role in recent communication systems, less attention has been paid on their positions in the given texts, despite that users carefully choose and put an emoji that matches their post. Exploring positions of emojis in texts will enhance understanding of the relationship between emojis and texts. We extend an emoji label prediction task taking into account the information of emoji positions, by jointly learning the emoji position in a tweet to predict the emoji label. The results demonstrate that the position of emojis in texts is a good clue to boost the performance of emoji label prediction. Human evaluation validates that there exists a suitable emoji position in a tweet, and our proposed task is able to make tweets more fancy and natural. In addition, considering emoji position can further improve the performance for the irony detection task compared to the emoji label prediction. We also report the experimental results for the modified dataset, due to the problem of the original dataset for the first shared task to predict an emoji label in SemEval2018.

pdf bib
Generic Mechanism for Reducing Repetitions in Encoder-Decoder Models
Ying Zhang | Hidetaka Kamigaito | Tatsuya Aoki | Hiroya Takamura | Manabu Okumura
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Encoder-decoder models have been commonly used for many tasks such as machine translation and response generation. As previous research reported, these models suffer from generating redundant repetition. In this research, we propose a new mechanism for encoder-decoder models that estimates the semantic difference of a source sentence before and after being fed into the encoder-decoder model to capture the consistency between two sides. This mechanism helps reduce repeatedly generated tokens for a variety of tasks. Evaluation results on publicly available machine translation and response generation datasets demonstrate the effectiveness of our proposal.

pdf bib
Generating Weather Comments from Meteorological Simulations
Soichiro Murakami | Sora Tanaka | Masatsugu Hangyo | Hidetaka Kamigaito | Kotaro Funakoshi | Hiroya Takamura | Manabu Okumura
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

The task of generating weather-forecast comments from meteorological simulations has the following requirements : (i) the changes in numerical values for various physical quantities need to be considered, (ii) the weather comments should be dependent on delivery time and area information, and (iii) the comments should provide useful information for users. To meet these requirements, we propose a data-to-text model that incorporates three types of encoders for numerical forecast maps, observation data, and meta-data. We also introduce weather labels representing weather information, such as sunny and rain, for our model to explicitly describe useful information. We conducted automatic and human evaluations. The results indicate that our model performed best against baselines in terms of informativeness. We make our code and data publicly available.

pdf bib
One-class Text Classification with Multi-modal Deep Support Vector Data Description
Chenlong Hu | Yukun Feng | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

This work presents multi-modal deep SVDD (mSVDD) for one-class text classification. By extending the uni-modal SVDD to a multiple modal one, we build mSVDD with multiple hyperspheres, that enable us to build a much better description for target one-class data. Additionally, the end-to-end architecture of mSVDD can jointly handle neural feature learning and one-class text learning. We also introduce a mechanism for incorporating negative supervision in the absence of real negative data, which can be beneficial to the mSVDD model. We conduct experiments on Reuters and 20 Newsgroup datasets, and the experimental results demonstrate that mSVDD outperforms uni-modal SVDD and mSVDD can get further improvements when negative supervision is incorporated.

2020

pdf bib
Learning with Contrastive Examples for Data-to-Text Generation
Yui Uehara | Tatsuya Ishigaki | Kasumi Aoki | Hiroshi Noji | Keiichi Goshima | Ichiro Kobayashi | Hiroya Takamura | Yusuke Miyao
Proceedings of the 28th International Conference on Computational Linguistics

Existing models for data-to-text tasks generate fluent but sometimes incorrect sentences e.g., Nikkei gains is generated when Nikkei drops is expected. We investigate models trained on contrastive examples i.e., incorrect sentences or terms, in addition to correct ones to reduce such errors. We first create rules to produce contrastive examples from correct ones by replacing frequent crucial terms such as gain or drop. We then use learning methods with several losses that exploit contrastive examples. Experiments on the market comment generation task show that 1) exploiting contrastive examples improves the capability of generating sentences with better lexical choice, without degrading the fluency, 2) the choice of the loss function is an important factor because the performances on different metrics depend on the types of loss functions, and 3) the use of the examples produced by some specific rules further improves performance. Human evaluation also supports the effectiveness of using contrastive examples.

pdf bib
An empirical analysis of existing systems and datasets toward general simple question answering
Namgi Han | Goran Topic | Hiroshi Noji | Hiroya Takamura | Yusuke Miyao
Proceedings of the 28th International Conference on Computational Linguistics

In this paper, we evaluate the progress of our field toward solving simple factoid questions over a knowledge base, a practically important problem in natural language interface to database. As in other natural language understanding tasks, a common practice for this task is to train and evaluate a model on a single dataset, and recent studies suggest that SimpleQuestions, the most popular and largest dataset, is nearly solved under this setting. However, this common setting does not evaluate the robustness of the systems outside of the distribution of the used training data. We rigorously evaluate such robustness of existing systems using different datasets. Our analysis, including shifting of training and test datasets and training on a union of the datasets, suggests that our progress in solving SimpleQuestions dataset does not indicate the success of more general simple question answering. We discuss a possible future direction toward this goal.

pdf bib
mgsohrab at WNUT 2020 Shared Task-1 : Neural Exhaustive Approach for Entity and Relation Recognition Over Wet Lab ProtocolsWNUT 2020 Shared Task-1: Neural Exhaustive Approach for Entity and Relation Recognition Over Wet Lab Protocols
Mohammad Golam Sohrab | Anh-Khoa Duong Nguyen | Makoto Miwa | Hiroya Takamura
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)

We present a neural exhaustive approach that addresses named entity recognition (NER) and relation recognition (RE), for the entity and re- lation recognition over the wet-lab protocols shared task. We introduce BERT-based neural exhaustive approach that enumerates all pos- sible spans as potential entity mentions and classifies them into entity types or no entity with deep neural networks to address NER. To solve relation extraction task, based on the NER predictions or given gold mentions we create all possible trigger-argument pairs and classify them into relation types or no relation. In NER task, we achieved 76.60 % in terms of F-score as third rank system among the partic- ipated systems. In relation extraction task, we achieved 80.46 % in terms of F-score as the top system in the relation extraction or recognition task. Besides we compare our model based on the wet lab protocols corpus (WLPC) with the WLPC baseline and dynamic graph-based in- formation extraction (DyGIE) systems.

2019

pdf bib
A Neural Pipeline Approach for the PharmaCoNER Shared Task using Contextual Exhaustive ModelsPharmaCoNER Shared Task using Contextual Exhaustive Models
Mohammad Golam Sohrab | Minh Thang Pham | Makoto Miwa | Hiroya Takamura
Proceedings of The 5th Workshop on BioNLP Open Shared Tasks

We present a neural pipeline approach that performs named entity recognition (NER) and concept indexing (CI), which links them to concept unique identifiers (CUIs) in a knowledge base, for the PharmaCoNER shared task on pharmaceutical drugs and chemical entities. We proposed a neural NER model that captures the surrounding semantic information of a given sequence by capturing the forward- and backward-context of bidirectional LSTM (Bi-LSTM) output of a target span using contextual span representation-based exhaustive approach. The NER model enumerates all possible spans as potential entity mentions and classify them into entity types or no entity with deep neural networks. For representing span, we compare several different neural network architectures and their ensembling for the NER model. We then perform dictionary matching for CI and, if there is no matching, we further compute similarity scores between a mention and CUIs using entity embeddings to assign the CUI with the highest score to the mention. We evaluate our approach on the two sub-tasks in the shared task. Among the five submitted runs, the best run for each sub-task achieved the F-score of 86.76 % on Sub-task 1 (NER) and the F-score of 79.97 % (strict) on Sub-task 2 (CI).

pdf bib
Proceedings of the First Workshop on Financial Technology and Natural Language Processing
Chung-Chi Chen | Hen-Hsen Huang | Hiroya Takamura | Hsin-Hsi Chen
Proceedings of the First Workshop on Financial Technology and Natural Language Processing

pdf bib
Proceedings of the 12th International Conference on Natural Language Generation
Kees van Deemter | Chenghua Lin | Hiroya Takamura
Proceedings of the 12th International Conference on Natural Language Generation

pdf bib
Controlling Contents in Data-to-Document Generation with Human-Designed Topic Labels
Kasumi Aoki | Akira Miyazawa | Tatsuya Ishigaki | Tatsuya Aoki | Hiroshi Noji | Keiichi Goshima | Ichiro Kobayashi | Hiroya Takamura | Yusuke Miyao
Proceedings of the 12th International Conference on Natural Language Generation

We propose a data-to-document generator that can easily control the contents of output texts based on a neural language model. Conventional data-to-text model is useful when a reader seeks a global summary of data because it has only to describe an important part that has been extracted beforehand. However, because depending on users, it differs what they are interested in, so it is necessary to develop a method to generate various summaries according to users’ interests. We develop a model to generate various summaries and to control their contents by providing the explicit targets for a reference to the model as controllable factors. In the experiments, we used five-minute or one-hour charts of 9 indicators (e.g., Nikkei225), as time-series data, and daily summaries of Nikkei Quick News as textual data. We conducted comparative experiments using two pieces of information : human-designed topic labels indicating the contents of a sentence and automatically extracted keywords as the referential information for generation.

pdf bib
Global Optimization under Length Constraint for Neural Text Summarization
Takuya Makino | Tomoya Iwakura | Hiroya Takamura | Manabu Okumura
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We propose a global optimization method under length constraint (GOLC) for neural text summarization models. GOLC increases the probabilities of generating summaries that have high evaluation scores, ROUGE in this paper, within a desired length. We compared GOLC with two optimization methods, a maximum log-likelihood and a minimum risk training, on CNN / Daily Mail and a Japanese single document summarization data set of The Mainichi Shimbun Newspapers. The experimental results show that a state-of-the-art neural summarization model optimized with GOLC generates fewer overlength summaries while maintaining the fastest processing speed ; only 6.70 % overlength summaries on CNN / Daily and 7.8 % on long summary of Mainichi, compared to the approximately 20 % to 50 % on CNN / Daily Mail and 10 % to 30 % on Mainichi with the other optimization methods. We also demonstrate the importance of the generation of in-length summaries for post-editing with the dataset Mainich that is created with strict length constraints. The ex- perimental results show approximately 30 % to 40 % improved post-editing time by use of in-length summaries.

pdf bib
A Simple and Effective Method for Injecting Word-Level Information into Character-Aware Neural Language Models
Yukun Feng | Hidetaka Kamigaito | Hiroya Takamura | Manabu Okumura
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

We propose a simple and effective method to inject word-level information into character-aware neural language models. Unlike previous approaches which usually inject word-level information at the input of a long short-term memory (LSTM) network, we inject it into the softmax function. The resultant model can be seen as a combination of character-aware language model and simple word-level language model. Our injection method can also be used together with previous methods. Through the experiments on 14 typologically diverse languages, we empirically show that our injection method, when used together with the previous methods, works better than the previous methods, including a gating mechanism, averaging, and concatenation of word vectors. We also provide a comprehensive comparison of these injection methods.

2018

pdf bib
Exploring the Influence of Spelling Errors on Lexical Variation Measures
Ryo Nagata | Taisei Sato | Hiroya Takamura
Proceedings of the 27th International Conference on Computational Linguistics

This paper explores the influence of spelling errors on lexical variation measures. Lexical richness measures such as Type-Token Ration (TTR) and Yule’s K are often used for learner English analysis and assessment. When applied to learner English, however, they can be unreliable because of the spelling errors appearing in it. Namely, they are, directly or indirectly, based on the counts of distinct word types, and spelling errors undesirably increase the number of distinct words. This paper introduces and examines the hypothesis that lexical richness measures become unstable in learner English because of spelling errors. Specifically, it tests the hypothesis on English learner corpora of three groups (middle school, high school, and college students). To be precise, it estimates the difference in TTR and Yule’s K caused by spelling errors, by calculating their values before and after spelling errors are manually corrected. Furthermore, it examines the results theoretically and empirically to deepen the understanding of the influence of spelling errors on them.

pdf bib
Neural Machine Translation Incorporating Named Entity
Arata Ugawa | Akihiro Tamura | Takashi Ninomiya | Hiroya Takamura | Manabu Okumura
Proceedings of the 27th International Conference on Computational Linguistics

This study proposes a new neural machine translation (NMT) model based on the encoder-decoder model that incorporates named entity (NE) tags of source-language sentences. Conventional NMT models have two problems enumerated as follows : (i) they tend to have difficulty in translating words with multiple meanings because of the high ambiguity, and (ii) these models’abilitytotranslatecompoundwordsseemschallengingbecausetheencoderreceivesaword, a part of the compound word, at each time step. To alleviate these problems, the encoder of the proposed model encodes the input word on the basis of its NE tag at each time step, which could reduce the ambiguity of the input word. Furthermore, the encoder introduces a chunk-level LSTM layer over a word-level LSTM layer and hierarchically encodes a source-language sentence to capture a compound NE as a chunk on the basis of the NE tags. We evaluate the proposed model on an English-to-Japanese translation task with the ASPEC, and English-to-Bulgarian and English-to-Romanian translation tasks with the Europarl corpus. The evaluation results show that the proposed model achieves up to 3.11 point improvement in BLEU.

pdf bib
Stylistically User-Specific Generation
Abdurrisyad Fikri | Hiroya Takamura | Manabu Okumura
Proceedings of the 11th International Conference on Natural Language Generation

Recent neural models for response generation show good results in terms of general responses. In real conversations, however, depending on the speaker / responder, similar utterances should require different responses. In this study, we attempt to consider individual user’s information in adjusting the notable sequence-to-sequence (seq2seq) model for more diverse, user-specific responses. We assume that we need user-specific features to adjust the response and we argue that some selected representative words from the users are suitable for this task. Furthermore, we prove that even for unseen or unknown users, our model can provide more diverse and interesting responses, while maintaining correlation with input utterances. Experimental results with human evaluation show that our model can generate more interesting responses than the popular seq2seqmodel and achieve higher relevance with input utterances than our baseline.

pdf bib
Generating Market Comments Referring to External Resources
Tatsuya Aoki | Akira Miyazawa | Tatsuya Ishigaki | Keiichi Goshima | Kasumi Aoki | Ichiro Kobayashi | Hiroya Takamura | Yusuke Miyao
Proceedings of the 11th International Conference on Natural Language Generation

Comments on a stock market often include the reason or cause of changes in stock prices, such as Nikkei turns lower as yen’s rise hits exporters. Generating such informative sentences requires capturing the relationship between different resources, including a target stock price. In this paper, we propose a model for automatically generating such informative market comments that refer to external resources. We evaluated our model through an automatic metric in terms of BLEU and human evaluation done by an expert in finance. The results show that our model outperforms the existing model both in BLEU scores and human judgment.

2017

pdf bib
Supervised Attention for Sequence-to-Sequence Constituency Parsing
Hidetaka Kamigaito | Katsuhiko Hayashi | Tsutomu Hirao | Hiroya Takamura | Manabu Okumura | Masaaki Nagata
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

The sequence-to-sequence (Seq2Seq) model has been successfully applied to machine translation (MT). Recently, MT performances were improved by incorporating supervised attention into the model. In this paper, we introduce supervised attention to constituency parsing that can be regarded as another translation task. Evaluation results on the PTB corpus showed that the bracketing F-measure was improved by supervised attention.

pdf bib
Learning to Generate Market Comments from Stock Prices
Soichiro Murakami | Akihiko Watanabe | Akira Miyazawa | Keiichi Goshima | Toshihiko Yanase | Hiroya Takamura | Yusuke Miyao
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper presents a novel encoder-decoder model for automatically generating market comments from stock prices. The model first encodes both short- and long-term series of stock prices so that it can mention short- and long-term changes in stock prices. In the decoding phase, our model can also generate a numerical value by selecting an appropriate arithmetic operation such as subtraction or rounding, and applying it to the input stock prices. Empirical experiments show that our best model generates market comments at the fluency and the informativeness approaching human-generated reference texts.

pdf bib
Japanese Sentence Compression with a Large Training DatasetJapanese Sentence Compression with a Large Training Dataset
Shun Hasegawa | Yuta Kikuchi | Hiroya Takamura | Manabu Okumura
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

In English, high-quality sentence compression models by deleting words have been trained on automatically created large training datasets. We work on Japanese sentence compression by a similar approach. To create a large Japanese training dataset, a method of creating English training dataset is modified based on the characteristics of the Japanese language. The created dataset is used to train Japanese sentence compression models based on the recurrent neural network.

pdf bib
UINSUSKA-TiTech at SemEval-2017 Task 3 : Exploiting Word Importance Levels for Similarity Features for CQAUINSUSKA-TiTech at SemEval-2017 Task 3: Exploiting Word Importance Levels for Similarity Features for CQA
Surya Agustian | Hiroya Takamura
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

The majority of core techniques to solve many problems in Community Question Answering (CQA) task rely on similarity computation. This work focuses on similarity between two sentences (or questions in subtask B) based on word embeddings. We exploit words importance levels in sentences or questions for similarity features, for classification and ranking with machine learning. Using only 2 types of similarity metric, our proposed method has shown comparable results with other complex systems. This method on subtask B 2017 dataset is ranked on position 7 out of 13 participants. Evaluation on 2016 dataset is on position 8 of 12, outperforms some complex systems. Further, this finding is explorable and potential to be used as baseline and extensible for many tasks in CQA and other textual similarity based system.

pdf bib
Distinguishing Japanese Non-standard Usages from Standard OnesJapanese Non-standard Usages from Standard Ones
Tatsuya Aoki | Ryohei Sasano | Hiroya Takamura | Manabu Okumura
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We focus on non-standard usages of common words on social media. In the context of social media, words sometimes have other usages that are totally different from their original. In this study, we attempt to distinguish non-standard usages on social media from standard ones in an unsupervised manner. Our basic idea is that non-standardness can be measured by the inconsistency between the expected meaning of the target word and the given context. For this purpose, we use context embeddings derived from word embeddings. Our experimental results show that the model leveraging the context embedding outperforms other methods and provide us with findings, for example, on how to construct context embeddings and which corpus to use.