Investigating noun-noun compound relation representations in autoregressive large language models

Saffron Kendrick, Mark Ormerod, Hui Wang, Barry Devereux


Abstract
This paper uses autoregressive large language models to explore at which points in a given input sentence semantic relation information is decodable. Using representational similarity analysis and probing, the results show that autoregressive models are capable of extracting semantic relation information from a dataset of noun-noun compounds. When considering the effect of processing the head and modifier nouns in context, the extracted representations show greater correlation after the model has processed both constituent nouns in the same sentence. The linguistic properties of the head nouns may influence the ability of LLMs to extract relation information when the head and modifier words are processed separately. Probing suggests that Phi-1 and LLaMA-3.2 are exposed to relation information during training, as they predict the relation vectors for compounds about as well from separate word representations as from compositional compound representations. However, the difference between processing conditions for GPT-2 and DeepSeek-R1 indicates that these models actively process the contextual semantic relation information of the compound.
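The representational similarity analysis the abstract refers to can be sketched as follows. This is a minimal, generic illustration of the technique (not the paper's implementation): the input arrays `model_reps` and `relation_vecs` are hypothetical stand-ins for LLM layer activations and semantic relation vectors, and the cosine/Spearman choices are common defaults rather than details taken from the paper.

```python
# Minimal RSA sketch: correlate the pairwise-dissimilarity structure of
# model representations with that of semantic relation vectors.
# Assumed inputs (hypothetical): model_reps (n_compounds x d_model),
# relation_vecs (n_compounds x d_rel), rows aligned by compound.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_correlation(model_reps: np.ndarray, relation_vecs: np.ndarray) -> float:
    """Spearman correlation between two representational dissimilarity matrices."""
    # pdist returns the condensed upper triangle of each RDM (cosine distance).
    rdm_model = pdist(model_reps, metric="cosine")
    rdm_relation = pdist(relation_vecs, metric="cosine")
    # Rank-correlate the two dissimilarity structures.
    rho, _ = spearmanr(rdm_model, rdm_relation)
    return float(rho)

# Toy usage with random data; rho always lies in [-1, 1].
rng = np.random.default_rng(0)
reps = rng.normal(size=(20, 64))        # stand-in for layer activations
rels = reps @ rng.normal(size=(64, 8))  # relation vectors linearly tied to reps
rho = rsa_correlation(reps, rels)
```

A higher rho would indicate that compounds with similar semantic relations also have similar model representations, which is the sense in which relation information is "decodable" from a given layer.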
Anthology ID:
2025.cmcl-1.30
Volume:
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico, USA
Editors:
Tatsuki Kuribayashi, Giulia Rambelli, Ece Takmaz, Philipp Wicke, Jixing Li, Byung-Doh Oh
Venues:
CMCL | WS
Publisher:
Association for Computational Linguistics
Pages:
253–263
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.cmcl-1.30/
Cite (ACL):
Saffron Kendrick, Mark Ormerod, Hui Wang, and Barry Devereux. 2025. Investigating noun-noun compound relation representations in autoregressive large language models. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, pages 253–263, Albuquerque, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Investigating noun-noun compound relation representations in autoregressive large language models (Kendrick et al., CMCL 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.cmcl-1.30.pdf