Stands to Reason: Investigating the Effect of Reasoning on Idiomaticity Detection

Dylan Phelps, Rodrigo Wilkens, Edward Gow-Smith, Thomas M. R. Pickard, Maggie Mi, Marco Idiart, Aline Villavicencio


Abstract
The recent shift towards reasoning models has improved the performance of Large Language Models (LLMs) on many tasks that involve logical steps. One linguistic task that could benefit from this framing is idiomaticity detection, as a potentially idiomatic expression must first be understood in relation to its context before it can be disambiguated. In this paper, we explore how reasoning capabilities in LLMs affect idiomaticity detection performance and examine the effect of model size. As representative open-source models, we evaluate the suite of DeepSeek-R1 distillation models, ranging from 1.5B to 70B parameters, across four idiomaticity detection datasets. We find the effect of reasoning to be smaller and more varied than expected. For smaller models, producing chain-of-thought (CoT) reasoning improves performance over the Math-tuned intermediate models, but not to the level of the base models, whereas larger models (14B, 32B, and 70B) show modest improvements. Our in-depth analyses reveal that larger models demonstrate a good understanding of idiomaticity, successfully producing accurate definitions of expressions, while smaller models often fail to output the actual meaning. For this reason, we also experiment with providing definitions in the prompts of smaller models, which we show can improve performance in some cases.
Anthology ID:
2026.lrec-main.419
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resources Association
Pages:
5367–5376
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.419/
Cite (ACL):
Dylan Phelps, Rodrigo Wilkens, Edward Gow-Smith, Thomas M. R. Pickard, Maggie Mi, Marco Idiart, and Aline Villavicencio. 2026. Stands to Reason: Investigating the Effect of Reasoning on Idiomaticity Detection. International Conference on Language Resources and Evaluation, main:5367–5376.
Cite (Informal):
Stands to Reason: Investigating the Effect of Reasoning on Idiomaticity Detection (Phelps et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.419.pdf