If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions
Caroline Pasquer, Agata Savary, Carlos Ramisch, Jean-Yves Antoine
Abstract
Multiword expressions, especially verbal ones (VMWEs), show idiosyncratic variability, which is challenging for NLP applications, hence the need for VMWE identification. We focus on the task of variant identification, i.e. identifying variants of previously seen VMWEs, whatever their surface form. We model the problem as a classification task. Syntactic subtrees with previously seen combinations of lemmas are first extracted, and then classified on the basis of features relevant to morpho-syntactic variation of VMWEs. Feature values are both absolute, i.e. hold for a particular VMWE candidate, and relative, i.e. based on comparing a candidate with previously seen VMWEs. This approach outperforms a baseline by 4 percentage points of F-measure on a French corpus.
- Anthology ID:
- C18-1219
- Volume:
- Proceedings of the 27th International Conference on Computational Linguistics
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Editors:
- Emily M. Bender, Leon Derczynski, Pierre Isabelle
- Venue:
- COLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2582–2594
- URL:
- https://aclanthology.org/C18-1219
- Cite (ACL):
- Caroline Pasquer, Agata Savary, Carlos Ramisch, and Jean-Yves Antoine. 2018. If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2582–2594, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions (Pasquer et al., COLING 2018)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/C18-1219.pdf
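
The two-step idea in the abstract (extract syntactic subtrees whose lemmas match a previously seen VMWE, then describe each candidate with absolute and relative features) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Token` structure, the function names, the connectivity test, and the single "gap" feature are all assumptions chosen for illustration, and the classifier itself is left out.

```python
from itertools import product
from typing import NamedTuple


class Token(NamedTuple):
    idx: int    # 1-based position in the sentence
    lemma: str
    head: int   # index of the syntactic head, 0 for the root


def is_connected(tokens, picked):
    """True if the picked tokens form a connected part of the dependency tree.

    In a tree, a node set is connected iff exactly one of its nodes
    has its head outside the set.
    """
    picked = set(picked)
    head_of = {t.idx: t.head for t in tokens}
    tops = [i for i in picked if head_of[i] not in picked]
    return len(tops) == 1


def extract_candidates(tokens, seen_vmwes):
    """Return (vmwe, token indices) for connected token sets matching a seen VMWE."""
    by_lemma = {}
    for t in tokens:
        by_lemma.setdefault(t.lemma, []).append(t.idx)
    candidates = []
    for vmwe in seen_vmwes:  # each VMWE is a tuple of lemmas (assumed representation)
        if all(lemma in by_lemma for lemma in vmwe):
            for combo in product(*(by_lemma[lemma] for lemma in vmwe)):
                if len(set(combo)) == len(combo) and is_connected(tokens, combo):
                    candidates.append((vmwe, tuple(sorted(combo))))
    return candidates


def features(combo, seen_gaps):
    """One absolute and one relative feature (hypothetical, for illustration)."""
    gap = combo[-1] - combo[0] + 1 - len(combo)  # absolute: number of inserted tokens
    return {"gap": gap, "gap_seen_before": gap in seen_gaps}  # relative: compared with seen VMWEs


# Passive variant "La décision a été prise" of the seen VMWE (prendre, décision).
sentence = [
    Token(1, "le", 2),
    Token(2, "décision", 5),
    Token(3, "avoir", 5),
    Token(4, "être", 5),
    Token(5, "prendre", 0),
]
candidates = extract_candidates(sentence, [("prendre", "décision")])
feats = features(candidates[0][1], seen_gaps={0, 1})
```

Here the passive sentence yields one candidate covering tokens 2 and 5, and the relative feature records that a gap of two inserted tokens was not among the (assumed) gaps observed for earlier occurrences. A classifier trained on such feature dictionaries would then decide whether the candidate is a true variant.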