Agnes Masip Gomez
2023
Enhancing Extreme Multi-Label Text Classification: Addressing Challenges in Model, Data, and Evaluation
Dan Li | Zi Long Zhu | Janneke van de Loo | Agnes Masip Gomez | Vikrant Yadav | Georgios Tsatsaronis | Zubair Afzal
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
Extreme multi-label text classification is a prevalent task in industry, but it frequently runs into machine learning challenges, including model limitations, data scarcity, and time-consuming evaluation. This paper aims to mitigate these issues with three novel approaches. First, we propose a label ranking model as an alternative to the conventional SciBERT-based classification model, enabling efficient handling of large label sets and accommodating new labels. Second, we present an active learning-based pipeline that addresses the scarcity of data for new labels when a classification system is updated. Finally, we use ChatGPT to assist with model evaluation. Our experiments demonstrate the effectiveness of these techniques for extreme multi-label text classification.