Abstract
We introspect black-box sentence embeddings by conditionally generating from them with the objective of retrieving the underlying discrete sentence. We conceive of this as a new unsupervised probing task and show that it correlates well with downstream task performance. We also illustrate how the language generated from different encoders differs. We apply our approach to generate sentence analogies from sentence embeddings.
- Anthology ID: 2020.coling-main.152
- Volume: Proceedings of the 28th International Conference on Computational Linguistics
- Month: December
- Year: 2020
- Address: Barcelona, Spain (Online)
- Editors: Donia Scott, Nuria Bel, Chengqing Zong
- Venue: COLING
- Publisher: International Committee on Computational Linguistics
- Pages: 1729–1736
- URL: https://aclanthology.org/2020.coling-main.152
- DOI: 10.18653/v1/2020.coling-main.152
- Cite (ACL): Martin Kerscher and Steffen Eger. 2020. Vec2Sent: Probing Sentence Embeddings with Natural Language Generation. In Proceedings of the 28th International Conference on Computational Linguistics, pages 1729–1736, Barcelona, Spain (Online). International Committee on Computational Linguistics.
- Cite (Informal): Vec2Sent: Probing Sentence Embeddings with Natural Language Generation (Kerscher & Eger, COLING 2020)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2020.coling-main.152.pdf
- Code: maruker/vec2sent
- Data: SentEval