SGM: Sequence Generation Model for Multi-label Classification
Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma, Wei Wu, Houfeng Wang
Abstract
Multi-label classification is an important yet challenging task in natural language processing. It is more complex than single-label classification in that the labels tend to be correlated. Existing methods tend to ignore the correlations between labels. Besides, different parts of the text can contribute differently for predicting different labels, which is not considered by existing models. In this paper, we propose to view the multi-label classification task as a sequence generation problem, and apply a sequence generation model with a novel decoder structure to solve it. Extensive experimental results show that our proposed methods outperform previous work by a substantial margin. Further analysis of experimental results demonstrates that the proposed methods not only capture the correlations between labels, but also select the most informative words automatically when predicting different labels.
- Anthology ID:
- C18-1330
- Volume:
- Proceedings of the 27th International Conference on Computational Linguistics
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Editors:
- Emily M. Bender, Leon Derczynski, Pierre Isabelle
- Venue:
- COLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3915–3926
- URL:
- https://aclanthology.org/C18-1330
- Cite (ACL):
- Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma, Wei Wu, and Houfeng Wang. 2018. SGM: Sequence Generation Model for Multi-label Classification. In Proceedings of the 27th International Conference on Computational Linguistics, pages 3915–3926, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- SGM: Sequence Generation Model for Multi-label Classification (Yang et al., COLING 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/C18-1330.pdf
- Code:
- lancopku/SGM
- Data:
- RCV1
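
The abstract's central idea, treating multi-label classification as sequence generation, can be illustrated with a minimal sketch of the data side only: an unordered label set is turned into an ordered target sequence that a sequence model could be trained to emit, and a generated sequence is mapped back to a label set. This is not the authors' implementation. The boundary tokens `<bos>` and `<eos>`, the function names, the toy labels, and the choice to order labels by descending training frequency are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter

BOS, EOS = "<bos>", "<eos>"  # hypothetical boundary tokens, not from the abstract


def label_frequencies(label_sets):
    """Count how often each label occurs across the training examples."""
    return Counter(label for labels in label_sets for label in labels)


def to_target_sequence(labels, freq):
    """Turn an unordered label set into an ordered target sequence.

    Assumption: labels are ordered by descending training frequency,
    with ties broken alphabetically so the order is deterministic.
    """
    ordered = sorted(set(labels), key=lambda label: (-freq[label], label))
    return [BOS] + ordered + [EOS]


def to_label_set(sequence):
    """Recover a label set from a generated sequence, stopping at EOS."""
    labels = set()
    for token in sequence:
        if token == EOS:
            break
        if token != BOS:
            labels.add(token)
    return labels


# Toy training label sets (hypothetical topic names).
train = [
    {"economics", "markets"},
    {"markets"},
    {"sports"},
    {"markets", "sports"},
]
freq = label_frequencies(train)
seq = to_target_sequence({"sports", "economics", "markets"}, freq)
print(seq)
print(to_label_set(seq))
```

Because each label in such a sequence is generated after the ones before it, a decoder trained on these targets can condition every prediction on the labels already emitted, which is how a sequence formulation can account for label correlations.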