Abstract
Many idiomatic expressions can be interpreted figuratively or literally depending on their contexts. This paper proposes an unsupervised learning method for recognizing the intended usages of idioms. We treat the usages as a latent variable in probabilistic models and train them in a linguistically motivated feature space. Crucially, we show that distributional semantics is a helpful heuristic for distinguishing the literal usage of idioms, giving us a way to formulate a literal usage metric to estimate the likelihood that the idiom is intended literally. This information then serves as a form of distant supervision to guide the unsupervised training process for the probabilistic models. Experiments show that our overall model performs competitively against supervised methods.
- Anthology ID:
- D18-1199
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1723–1731
- URL:
- https://aclanthology.org/D18-1199
- DOI:
- 10.18653/v1/D18-1199
- Cite (ACL):
- Changsheng Liu and Rebecca Hwa. 2018. Heuristically Informed Unsupervised Idiom Usage Recognition. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1723–1731, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Heuristically Informed Unsupervised Idiom Usage Recognition (Liu & Hwa, EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/D18-1199.pdf