LLM-Human Pipeline for Cultural Grounding of Conversations

Rajkumar Pujari, Dan Goldwasser


Abstract
Conversations often adhere to well-understood social norms that vary across cultures. For example, while addressing parents by name is commonplace in the West, it is rare in most Asian cultures. Adherence or violation of such norms often dictates the tenor of conversations. Humans are able to navigate social situations requiring cultural awareness quite adeptly. However, it is a hard task for NLP models.In this paper, we tackle this problem by introducing a Cultural Context Schema for conversations. It comprises (1) conversational information such as emotions, dialogue acts, etc., and (2) cultural information such as social norms, violations, etc. We generate ~110k social norm and violation descriptions for ~23k conversations from Chinese culture using LLMs. We refine them using automated verification strategies which are evaluated against culturally aware human judgements. We organize these descriptions into meaningful structures we call Norm Concepts, using an interactive human-in-loop framework. We ground the norm concepts and the descriptions in conversations using symbolic annotation. Finally, we use the obtained dataset for downstream tasks such as emotion, sentiment, and dialogue act detection. We show that it significantly improves the empirical performance.
Anthology ID:
2025.naacl-long.48
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1029–1048
Language:
URL:
https://preview.aclanthology.org/moar-dois/2025.naacl-long.48/
DOI:
10.18653/v1/2025.naacl-long.48
Bibkey:
Cite (ACL):
Rajkumar Pujari and Dan Goldwasser. 2025. LLM-Human Pipeline for Cultural Grounding of Conversations. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1029–1048, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
LLM-Human Pipeline for Cultural Grounding of Conversations (Pujari & Goldwasser, NAACL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/moar-dois/2025.naacl-long.48.pdf