HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization

Kazuya Shimura, Jiyi Li, Fumiyo Fukumoto


Abstract
We focus on the multi-label categorization task for short texts and explore the use of a hierarchical structure (HS) of categories. In contrast to existing work using non-hierarchical flat models, our method leverages the hierarchical relations between the pre-defined categories to tackle the data sparsity problem. The lower the HS level, the worse the categorization performance, because the number of training instances per category at a lower level is much smaller than at an upper level. We propose an approach that effectively utilizes the data in the upper levels to contribute to categorization in the lower levels by applying a Convolutional Neural Network (CNN) with a fine-tuning technique. Results on two benchmark datasets show that the proposed method, Hierarchical Fine-Tuning based CNN (HFT-CNN), is competitive with state-of-the-art CNN-based methods.
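To make the fine-tuning idea concrete, the following is a minimal PyTorch sketch (not the authors' implementation, and the model/variable names are illustrative): a small text CNN is first trained on the upper-level (coarse) categories, and its embedding and convolutional weights are then transferred to initialize a lower-level classifier, which is fine-tuned on the sparser lower-level labels.

```python
# Minimal sketch of hierarchical fine-tuning for a text CNN.
# Assumed architecture and hyperparameters; not the paper's exact setup.
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Simple CNN text classifier: embedding -> 1D convolutions -> max-pool -> linear."""
    def __init__(self, vocab_size, embed_dim, num_classes,
                 kernel_sizes=(2, 3, 4), num_filters=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, x):                       # x: (batch, seq_len) token ids
        e = self.embed(x).transpose(1, 2)       # (batch, embed_dim, seq_len)
        pooled = [torch.relu(c(e)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # multi-label logits (BCEWithLogitsLoss)

# 1) Train a model on the upper-level (coarse) categories.
upper = TextCNN(vocab_size=30000, embed_dim=100, num_classes=4)
# ... train `upper` with nn.BCEWithLogitsLoss() on upper-level labels ...

# 2) Build the lower-level model, transfer the embedding/convolution weights,
#    and fine-tune on the lower-level labels. The upper-level data thus
#    informs the lower levels, mitigating their data sparsity.
lower = TextCNN(vocab_size=30000, embed_dim=100, num_classes=55)
lower.embed.load_state_dict(upper.embed.state_dict())
lower.convs.load_state_dict(upper.convs.state_dict())
# ... fine-tune `lower` with nn.BCEWithLogitsLoss() on lower-level labels ...
```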
Anthology ID:
D18-1093
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
811–816
URL:
https://aclanthology.org/D18-1093
DOI:
10.18653/v1/D18-1093
Cite (ACL):
Kazuya Shimura, Jiyi Li, and Fumiyo Fukumoto. 2018. HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 811–816, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
HFT-CNN: Learning Hierarchical Category Structure for Multi-label Short Text Categorization (Shimura et al., EMNLP 2018)
PDF:
https://preview.aclanthology.org/ml4al-ingestion/D18-1093.pdf
Attachment:
 D18-1093.Attachment.zip
Code:
ShimShim46/HFT-CNN
Data:
RCV1