Heuristically Informed Unsupervised Idiom Usage Recognition

Changsheng Liu, Rebecca Hwa


Abstract
Many idiomatic expressions can be interpreted figuratively or literally depending on their contexts. This paper proposes an unsupervised learning method for recognizing the intended usages of idioms. We treat the usages as a latent variable in probabilistic models and train them in a linguistically motivated feature space. Crucially, we show that distributional semantics is a helpful heuristic for distinguishing the literal usage of idioms, giving us a way to formulate a literal usage metric to estimate the likelihood that the idiom is intended literally. This information then serves as a form of distant supervision to guide the unsupervised training process for the probabilistic models. Experiments show that our overall model performs competitively against supervised methods.
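The abstract does not spell out how the literal usage metric is computed, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes the metric can be approximated by the cosine similarity between the averaged embedding of the idiom's constituent words and the averaged embedding of the surrounding context words. The names `TOY_EMBEDDINGS`, `mean_vector`, and `literal_usage_score`, as well as the three-dimensional vectors, are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical toy embeddings (assumption for illustration only);
# real vectors would come from a trained distributional model.
TOY_EMBEDDINGS = {
    "break": [0.9, 0.1, 0.0],
    "ice": [0.8, 0.2, 0.1],
    "frozen": [0.85, 0.15, 0.05],
    "lake": [0.7, 0.3, 0.1],
    "party": [0.1, 0.9, 0.2],
    "joke": [0.0, 0.8, 0.3],
}


def mean_vector(words, embeddings):
    """Average the embeddings of the words that have a known vector."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def literal_usage_score(idiom_words, context_words, embeddings=TOY_EMBEDDINGS):
    """Assumed stand-in for a literal usage metric: similarity between
    the idiom's constituent words and the words in its context."""
    idiom_vec = mean_vector(idiom_words, embeddings)
    context_vec = mean_vector(context_words, embeddings)
    if idiom_vec is None or context_vec is None:
        return 0.0
    return cosine(idiom_vec, context_vec)


# "break the ice" next to words about a frozen lake versus a social setting.
literal = literal_usage_score(["break", "ice"], ["frozen", "lake"])
figurative = literal_usage_score(["break", "ice"], ["party", "joke"])
print(f"literal context score:    {literal:.3f}")
print(f"figurative context score: {figurative:.3f}")
```

In this toy setup the context that is semantically close to the idiom's own words receives the higher score. A score of this kind could then act as a soft, noisy signal for initializing or guiding an unsupervised model, in the spirit of the distant supervision the abstract describes.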
Anthology ID:
D18-1199
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1723–1731
URL:
https://aclanthology.org/D18-1199
DOI:
10.18653/v1/D18-1199
Cite (ACL):
Changsheng Liu and Rebecca Hwa. 2018. Heuristically Informed Unsupervised Idiom Usage Recognition. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1723–1731, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Heuristically Informed Unsupervised Idiom Usage Recognition (Liu & Hwa, EMNLP 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/D18-1199.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-2/D18-1199.mp4