Abstract
Idioms such as “call it a day” and “piece of cake” are prevalent in natural language. How do Transformer language models process idioms? This study examines this question by analysing three models: BERT, Multilingual BERT, and DistilBERT. We compare the embeddings of idiomatic and literal expressions across all layers of the networks at both the sentence and word levels. Additionally, we investigate the attention that other sentence tokens direct towards a word when it appears within an idiom, compared with when it appears in a literal context. Results indicate that while the three models exhibit slightly different internal mechanisms, they all represent idioms distinctively compared to literal language, with attention playing a critical role. These findings suggest that idioms are semantically and syntactically idiosyncratic, not only for humans but also for language models.
- Anthology ID:
- 2023.starsem-1.16
- Volume:
- Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Alexis Palmer, Jose Camacho-Collados
- Venue:
- *SEM
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 174–179
- URL:
- https://aclanthology.org/2023.starsem-1.16
- DOI:
- 10.18653/v1/2023.starsem-1.16
- Cite (ACL):
- Ye Tian, Isobel James, and Hye Son. 2023. How Are Idioms Processed Inside Transformer Language Models? In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), pages 174–179, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- How Are Idioms Processed Inside Transformer Language Models? (Tian et al., *SEM 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2023.starsem-1.16.pdf