Abstract
Understanding idioms is important in NLP. In this paper, we study to what extent the pre-trained BERT model can encode the meaning of a potentially idiomatic expression (PIE) in a given context. We make use of several existing datasets and perform two probing tasks: PIE usage classification and idiom paraphrase identification. Our experimental results suggest that BERT can indeed separate the literal and idiomatic usages of a PIE with high accuracy. It is also able to encode the idiomatic meaning of a PIE to some extent.
- Anthology ID:
- 2021.ranlp-1.156
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
- Month:
- September
- Year:
- 2021
- Address:
- Held Online
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 1397–1407
- URL:
- https://aclanthology.org/2021.ranlp-1.156
- Cite (ACL):
- Minghuan Tan and Jing Jiang. 2021. Does BERT Understand Idioms? A Probing-Based Empirical Study of BERT Encodings of Idioms. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1397–1407, Held Online. INCOMA Ltd.
- Cite (Informal):
- Does BERT Understand Idioms? A Probing-Based Empirical Study of BERT Encodings of Idioms (Tan & Jiang, RANLP 2021)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/2021.ranlp-1.156.pdf