Neural Attentive Bag-of-Entities Model for Text Classification

Ikuya Yamada, Hiroyuki Shindo


Abstract
This study proposes the Neural Attentive Bag-of-Entities model, a neural network model that performs text classification using entities in a knowledge base. Entities provide unambiguous and relevant semantic signals that are beneficial for text classification. We combine simple, high-recall, dictionary-based entity detection, which detects candidate entities in a document, with a novel neural attention mechanism that enables the model to focus on a small number of unambiguous and relevant entities. We tested the effectiveness of our model using two standard text classification datasets (i.e., the 20 Newsgroups and R8 datasets) and a popular factoid question answering dataset based on a trivia quiz game. Our model achieved state-of-the-art results on all three datasets. The source code of the proposed model is available online at https://github.com/wikipedia2vec/wikipedia2vec.
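The core idea in the abstract can be sketched as follows: a dictionary detects candidate entities in a document, an attention mechanism scores each candidate, and the softmax-weighted sum of entity embeddings becomes the document representation. This is a minimal illustrative sketch, not the authors' implementation: the entity embeddings, the word-based document vector, and the attention parameters below are random placeholders (the paper uses pretrained Wikipedia entity embeddings), and the scoring function is a simple bilinear form assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 8

# Hypothetical entity vocabulary with toy embeddings (placeholders for
# the pretrained Wikipedia entity embeddings used in the paper).
entity_emb = {
    "Apple_Inc.": rng.normal(size=EMB_DIM),
    "Apple_(fruit)": rng.normal(size=EMB_DIM),
    "IPhone": rng.normal(size=EMB_DIM),
}

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_bag_of_entities(candidates, word_vec, attn_w):
    """Score each candidate entity with a bilinear attention function,
    then return the softmax-weighted sum of entity embeddings."""
    E = np.stack([entity_emb[c] for c in candidates])  # (n, d)
    scores = E @ attn_w @ word_vec                     # (n,) attention logits
    weights = softmax(scores)                          # focus on few entities
    return weights @ E                                 # (d,) document vector

# Placeholder word-based document vector and attention parameters.
word_vec = rng.normal(size=EMB_DIM)
attn_w = 0.1 * rng.normal(size=(EMB_DIM, EMB_DIM))

doc_vec = attentive_bag_of_entities(list(entity_emb), word_vec, attn_w)
print(doc_vec.shape)  # (8,)
```

In the full model, this entity-based document vector would be concatenated with (or combined with) a word-based representation and fed to a linear classifier; the attention weights let ambiguous dictionary matches contribute little to the final representation.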
Anthology ID:
K19-1052
Volume:
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Mohit Bansal, Aline Villavicencio
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
563–573
URL:
https://aclanthology.org/K19-1052
DOI:
10.18653/v1/K19-1052
Cite (ACL):
Ikuya Yamada and Hiroyuki Shindo. 2019. Neural Attentive Bag-of-Entities Model for Text Classification. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), pages 563–573, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Neural Attentive Bag-of-Entities Model for Text Classification (Yamada & Shindo, CoNLL 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/K19-1052.pdf
Code:
wikipedia2vec/wikipedia2vec (+ additional community code)