Sharpness-Aware Minimization for Topic Models with High-Quality Document Representations

Tung Nguyen, Tue Le, Hoang Tran Vuong, Quang Duc Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen


Abstract
Recent advanced frameworks in topic models have significantly enhanced the performance compared to conventional probabilistic approaches. Such models, mostly constructed from neural network architecture together with other advanced techniques such as contextual embedding, optimal transport distance and pre-trained language model, etc. have effectively improved the topic quality and document topic distribution. Despite the improvements, these methods lack considerations of effective optimization for complex objective functions that contain log-likelihood and additional regularization terms. In this study, we propose to apply an efficient optimization method to improve the generalization and performance of topic models. Our approach explicitly considers the sharpness of the loss landscape during optimization, which forces the optimizer to choose directions in the parameter space that lead to flatter minima, in which the models are typically more stable and robust to small perturbations in the data. Additionally, we propose an effective strategy to select the flatness region for parameter optimization by leveraging the optimal transport distance between doc-topic distributions and doc-cluster proportions, which can effectively enhance document representation. Experimental results on popular benchmark datasets demonstrate that our method effectively improves the performance of baseline topic models.
Anthology ID:
2025.naacl-long.231
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
4507–4524
Language:
URL:
https://preview.aclanthology.org/landing_page/2025.naacl-long.231/
DOI:
Bibkey:
Cite (ACL):
Tung Nguyen, Tue Le, Hoang Tran Vuong, Quang Duc Nguyen, Duc Anh Nguyen, Linh Ngo Van, Sang Dinh, and Thien Huu Nguyen. 2025. Sharpness-Aware Minimization for Topic Models with High-Quality Document Representations. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4507–4524, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Sharpness-Aware Minimization for Topic Models with High-Quality Document Representations (Nguyen et al., NAACL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/landing_page/2025.naacl-long.231.pdf