Exploring Topological Invariance in Semantic Embeddings

Fangzhou Gao, Justin Brody


Abstract
We present the results of preliminary explorations into using the topology of embedded manifolds as a semantic invariant. Our main question is whether the topology of large embedded corpora is invariant in the following two senses. First, one might reasonably expect that the same corpus in two languages would give topologically equivalent embeddings. Second, one might reasonably expect that the same corpus embedded by two different embedding models would give topologically equivalent embeddings. In the paper we motivate these intuitions and give preliminary results indicating that they hold to some extent.
Anthology ID:
2026.nlp4dh-1.29
Volume:
Proceedings of the 6th International Conference on Natural Language Processing for the Digital Humanities
Month:
July
Year:
2026
Address:
San Diego, USA
Editors:
Sil Hamilton, Emily Öhman, Rebecca M. M. Hicke, Yuri Bizzoni, Axel Bax, Jacob A. Matthews, Mika Hämäläinen
Venues:
NLP4DH | WS
Publisher:
Association for Computational Linguistics
Pages:
320–324
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.nlp4dh-1.29/
Cite (ACL):
Fangzhou Gao and Justin Brody. 2026. Exploring Topological Invariance in Semantic Embeddings. In Proceedings of the 6th International Conference on Natural Language Processing for the Digital Humanities, pages 320–324, San Diego, USA. Association for Computational Linguistics.
Cite (Informal):
Exploring Topological Invariance in Semantic Embeddings (Gao & Brody, NLP4DH 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.nlp4dh-1.29.pdf