@inproceedings{gao-etal-2025-masp,
title = "{MASP}: A Multilingual Dataset for Probing Scalar Modifier Understanding in {LLM}s",
author = "Gao, Xinyu and
Ding, Nai and
Liu, Wei",
editor = "Sun, Maosong and
Duan, Peiyong and
Liu, Zhiyuan and
Xu, Ruifeng and
Sun, Weiwei",
booktitle = "Proceedings of the 24th {C}hina National Conference on Computational Linguistics ({CCL} 2025)",
month = aug,
year = "2025",
address = "Jinan, China",
publisher = "Chinese Information Processing Society of China",
url = "https://preview.aclanthology.org/ingest-ccl/2025.ccl-1.76/",
pages = "1003--1019",
    abstract = "This study aims to test how large language models (LLMs) understand gradable adjectives and whether their understanding compares with humans, under the framework of formal semantics. We introduce a diagnostic dataset, referred to as the Modifier-Adjective Scale Probe (MASP), to evaluate how well LLMs understand a gradable adjective (e.g., long) when the adjective is combined with one modifier (e.g., very long or slightly long, a condition referred to as degree modification) or is further negated (e.g., very not long and not very long, a condition referred to as compositional negation). The dataset consists of over 80,000 natural language inference questions in both Chinese and English. We apply the MASP dataset to test both humans and 11 popular LLMs, including GPT-4o and Gemini-2.0-Flash. The results show that most LLMs can correctly understand whether a modifier boosts (e.g., very) an adjective. However, they fail to understand the modifiers that weaken the degree and the negation forms of modifiers. Furthermore, we parameterize the human and LLM behavior, and find that the judgment patterns of LLMs differ from humans especially in the Chinese tests. These findings suggest that LLMs are still not well aligned with humans in terms of the interpretation of simple adjective phrases, and MASP provides a new approach to quantify the interpretation of adjective phrases in LLMs."
}