“Going to a trap house” conveys more fear than “Going to a mall”: Benchmarking Emotion Context Sensitivity for LLMs

Eojin Jeon; Mingyu Lee; Sangyun Kim; Junho Kim; Wanzee Cho; Tae-Eui Kam; SangKeun Lee

doi:10.18653/v1/2025.findings-emnlp.802

Abstract

Emotion context sensitivity—the ability to adjust emotional responses based on contexts—is a core component of human emotional intelligence. For example, being told, “You can come with me if you want,” may elicit joy if the destination is a mall, but provoke fear if the destination is a trap house. As large language models (LLMs) are increasingly deployed in socially interactive settings, understanding this human ability becomes crucial for generating context-appropriate, emotion-aware responses. In this work, we introduce Trace, a novel benchmark for evaluating whether LLMs can understand emotion context sensitivity of humans. This benchmark consists of 1,626 social scenarios and comprises two complementary tests: a sensitivity test, which measures whether models can detect emotional shifts caused by context changes, and a robustness test, which evaluates whether models can maintain stable emotion predictions when context changes are emotionally irrelevant. Each scenario pair keeps the core event constant while systematically varying contextual details—time, place, or agent—based on insights from behavioral theory and emotion psychology. Experimental results show that even the best-performing LLMs lag behind human performance by 20% in the sensitivity test and 15% in the robustness test, indicating substantial room for improvement in emotion-aware reasoning.

Anthology ID: 2025.findings-emnlp.802
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 14848–14869
URL: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.802/
DOI: 10.18653/v1/2025.findings-emnlp.802
Cite (ACL): Eojin Jeon, Mingyu Lee, Sangyun Kim, Junho Kim, Wanzee Cho, Tae-Eui Kam, and SangKeun Lee. 2025. “Going to a trap house” conveys more fear than “Going to a mall”: Benchmarking Emotion Context Sensitivity for LLMs. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 14848–14869, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): “Going to a trap house” conveys more fear than “Going to a mall”: Benchmarking Emotion Context Sensitivity for LLMs (Jeon et al., Findings 2025)
PDF: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.802.pdf
Checklist: 2025.findings-emnlp.802.checklist.pdf
