Abstract
We evaluated various compositional models, from bag-of-words representations to compositional RNN-based models, on several extrinsic supervised and unsupervised evaluation benchmarks. Our results confirm that weighted vector averaging can outperform context-sensitive models in most benchmarks, but structural features encoded in RNN models can also be useful in certain classification tasks. We analyzed some of the evaluation datasets to identify the aspects of meaning they measure and the characteristics of the various models that explain their performance variance.
- Anthology ID:
- C18-1226
- Volume:
- Proceedings of the 27th International Conference on Computational Linguistics
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Editors:
- Emily M. Bender, Leon Derczynski, Pierre Isabelle
- Venue:
- COLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2666–2677
- URL:
- https://aclanthology.org/C18-1226
- Cite (ACL):
- Hanan Aldarmaki and Mona Diab. 2018. Evaluation of Unsupervised Compositional Representations. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2666–2677, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- Evaluation of Unsupervised Compositional Representations (Aldarmaki & Diab, COLING 2018)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/C18-1226.pdf
- Code:
- h-aldarmaki/sentence_eval
- Data:
- IMDb Movie Reviews, MPQA Opinion Corpus, SICK