Abstract
In this paper, we propose a novel document descriptor based on the covariance matrix of the word vectors of a document. The descriptor has a fixed length, which makes it easy to use in many supervised and unsupervised applications. We tested the descriptor on several tasks, covering both supervised and unsupervised settings. Our evaluation shows that the document covariance descriptor suits these different tasks and performs competitively with state-of-the-art methods.
- Anthology ID: P18-2084
- Volume: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
- Month: July
- Year: 2018
- Address: Melbourne, Australia
- Editors: Iryna Gurevych, Yusuke Miyao
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 527–532
- URL: https://aclanthology.org/P18-2084
- DOI: 10.18653/v1/P18-2084
- Cite (ACL): Marwan Torki. 2018. A Document Descriptor using Covariance of Word Vectors. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 527–532, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal): A Document Descriptor using Covariance of Word Vectors (Torki, ACL 2018)
- PDF: https://preview.aclanthology.org/nschneid-patch-1/P18-2084.pdf
- Data: IMDb Movie Reviews, SICK
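
The abstract's central idea, a fixed-length descriptor built from the covariance matrix of a document's word vectors, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's confirmed procedure: the function name `covariance_descriptor`, the toy three-dimensional embeddings, skipping out-of-vocabulary tokens, dividing by the number of words, and flattening the upper triangle with off-diagonal entries scaled by sqrt(2).

```python
import math
from typing import Dict, List, Sequence


def covariance_descriptor(
    tokens: Sequence[str],
    embeddings: Dict[str, Sequence[float]],
) -> List[float]:
    """Fixed-length descriptor from the covariance of a document's word vectors.

    Hypothetical helper for illustration; not the paper's reference code.
    """
    # Assumption: tokens without a known vector are simply skipped.
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("document contains no known word vectors")
    n, d = len(vectors), len(vectors[0])

    # Mean word vector of the document.
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]

    # d x d covariance matrix (assumption: population form, dividing by n).
    cov = [
        [sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / n
         for j in range(d)]
        for i in range(d)
    ]

    # The matrix is symmetric, so the upper triangle carries all of it.
    # Assumption: off-diagonal entries are scaled by sqrt(2) so the vector's
    # Euclidean norm matches the matrix's Frobenius norm.
    descriptor = []
    for i in range(d):
        for j in range(i, d):
            scale = 1.0 if i == j else math.sqrt(2.0)
            descriptor.append(scale * cov[i][j])
    return descriptor


# Toy 3-dimensional embeddings, invented for this example.
toy = {
    "good": [1.0, 0.0, 2.0],
    "movie": [0.0, 1.0, 0.0],
    "bad": [-1.0, 0.0, 1.0],
    "plot": [0.0, 2.0, 1.0],
}
short_doc = covariance_descriptor(["good", "movie"], toy)
long_doc = covariance_descriptor(["bad", "plot", "movie", "good", "unknown"], toy)
print(len(short_doc), len(long_doc))
```

Under these assumptions the descriptor length is d(d+1)/2 for d-dimensional word vectors, regardless of how many words the document contains, which is what allows documents of different lengths to be fed to the same classifier or similarity measure.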