Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities

Minh Duc Chu, Zihao He, Rebecca Dorn, Kristina Lerman


Abstract
Large language models (LLMs) have shown promise in representing individuals and communities, offering new ways to study complex social dynamics. However, effectively aligning LLMs with specific human groups and systematically assessing the fidelity of the alignment remains a challenge. This paper presents a robust framework for aligning LLMs with online communities via instruction-tuning and comprehensively evaluating alignment across various aspects of language, including authenticity, emotional tone, toxicity, and harm. We demonstrate the utility of our approach by applying it to online communities centered on dieting and body image. We administer an eating disorder psychometric test to the aligned LLMs to reveal unhealthy beliefs and successfully differentiate communities with varying levels of eating disorder risk. Our results highlight the potential of LLMs in automated moderation and broader applications in public health and social science research.
Anthology ID:
2025.naacl-long.5
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
88–111
Language:
URL:
https://preview.aclanthology.org/Author-page-Marten-During-lu/2025.naacl-long.5/
DOI:
Bibkey:
Cite (ACL):
Minh Duc Chu, Zihao He, Rebecca Dorn, and Kristina Lerman. 2025. Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 88–111, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities (Chu et al., NAACL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/Author-page-Marten-During-lu/2025.naacl-long.5.pdf