Abstract
In this paper, we perform a comparative evaluation of off-the-shelf embedding models over the task of compositionality prediction of multiword expressions ("MWEs"). Our experimental results suggest that character- and document-level models capture knowledge of MWE compositionality and are effective in modelling varying levels of compositionality, with the advantage over word-level models that they do not require token-level identification of MWEs in the training corpus.
- Anthology ID:
- U18-1009
- Volume:
- Proceedings of the Australasian Language Technology Association Workshop 2018
- Month:
- December
- Year:
- 2018
- Address:
- Dunedin, New Zealand
- Editors:
- Sunghwan Mac Kim, Xiuzhen (Jenny) Zhang
- Venue:
- ALTA
- Pages:
- 71–76
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/U18-1009/
- Cite (ACL):
- Navnita Nandakumar, Bahar Salehi, and Timothy Baldwin. 2018. A Comparative Study of Embedding Models in Predicting the Compositionality of Multiword Expressions. In Proceedings of the Australasian Language Technology Association Workshop 2018, pages 71–76, Dunedin, New Zealand.
- Cite (Informal):
- A Comparative Study of Embedding Models in Predicting the Compositionality of Multiword Expressions (Nandakumar et al., ALTA 2018)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/U18-1009.pdf