Qian Sun


2021

Evaluating Hierarchical Document Categorisation
Qian Sun | Aili Shen | Hiyori Yoshikawa | Chunpeng Ma | Daniel Beck | Tomoya Iwakura | Timothy Baldwin
Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association

Hierarchical document categorisation is a special case of multi-label document categorisation in which the labels are organised in a taxonomic hierarchy. While various approaches have been proposed for hierarchical document categorisation, there is no standard benchmark dataset; as a result, different methods have been evaluated independently, and there is no empirical consensus on which methods perform best. In this work, we examine different combinations of neural text encoders and hierarchical methods in an end-to-end framework, and evaluate them over three datasets. We find that the performance of hierarchical document categorisation is determined not only by how the hierarchical information is modelled, but also by the structure of the label hierarchy and the class distribution.