Softmax Tree: An Accurate, Fast Classifier When the Number of Classes Is Large

Arman Zharmagambetov, Magzhan Gabidolla, Miguel A. Carreira-Perpinan


Abstract
Classification problems having thousands or more classes naturally occur in NLP, for example language models or document classification. A softmax or one-vs-all classifier naturally handles many classes, but it is very slow at inference time, because every class score must be calculated to find the top class. We propose the “softmax tree”, consisting of a binary tree having sparse hyperplanes at the decision nodes (which make hard, not soft, decisions) and small softmax classifiers at the leaves. This is much faster at inference because the input instance follows a single path to a leaf (whose length is logarithmic on the number of leaves) and the softmax classifier at each leaf operates on a small subset of the classes. Although learning accurate tree-based models has proven difficult in the past, we are able to overcome this by using a variation of a recent algorithm, tree alternating optimization (TAO). Compared to a softmax and other classifiers, the resulting softmax trees are both more accurate in prediction and faster in inference, as shown in NLP problems having from one thousand to one hundred thousand classes.
Anthology ID:
2021.emnlp-main.838
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
10730–10745
Language:
URL:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2021.emnlp-main.838/
DOI:
10.18653/v1/2021.emnlp-main.838
Bibkey:
Cite (ACL):
Arman Zharmagambetov, Magzhan Gabidolla, and Miguel A. Carreira-Perpinan. 2021. Softmax Tree: An Accurate, Fast Classifier When the Number of Classes Is Large. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 10730–10745, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Softmax Tree: An Accurate, Fast Classifier When the Number of Classes Is Large (Zharmagambetov et al., EMNLP 2021)
Copy Citation:
PDF:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2021.emnlp-main.838.pdf
Video:
 https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2021.emnlp-main.838.mp4