Abstract
This article describes a dependency-based strategy that uses compositional distributional semantics and cross-lingual word embeddings to translate multiword expressions (MWEs). Our unsupervised approach performs translation as a process of word contextualization by taking into account lexico-syntactic contexts and selectional preferences. This strategy is suited to translating phraseological combinations and phrases whose constituent words lexically restrict each other. Several experiments on adjective-noun and verb-object compounds show that mutual contextualization (co-compositionality) clearly outperforms other compositional methods. The paper also contributes a new, freely available dataset of English-Spanish MWEs used to validate the proposed compositional strategy.
- Anthology ID:
- W19-5106
- Volume:
- Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Agata Savary, Carla Parra Escartín, Francis Bond, Jelena Mitrović, Verginica Barbu Mititelu
- Venue:
- MWE
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 40–48
- URL:
- https://aclanthology.org/W19-5106
- DOI:
- 10.18653/v1/W19-5106
- Cite (ACL):
- Pablo Gamallo and Marcos Garcia. 2019. Unsupervised Compositional Translation of Multiword Expressions. In Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019), pages 40–48, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Unsupervised Compositional Translation of Multiword Expressions (Gamallo & Garcia, MWE 2019)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/W19-5106.pdf