Fangzhou Gao
2026
Exploring Topological Invariance in Semantic Embeddings
Fangzhou Gao | Justin Brody
Proceedings of the 6th International Conference on Natural Language Processing for the Digital Humanities
We present the results of preliminary explorations into using the topology of embedded manifolds as a semantic invariant. Our main question is whether the topology of large embedded corpora is invariant in two senses. First, one might reasonably expect the same corpus in two languages to give topologically equivalent embeddings. Second, one might reasonably expect the same corpus embedded by two different embedding models to give topologically equivalent embeddings. In the paper we motivate these intuitions and give preliminary results indicating that they hold to some extent.
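As a rough illustration of what comparing the topology of two embeddings could look like, the sketch below computes one simple topological summary of a point cloud and checks that it agrees across two copies of the same data. The abstract does not say which invariants or tools the paper uses, so everything here is an assumption: the summary chosen is the set of death times of 0-dimensional persistent homology in a Vietoris-Rips filtration (which coincide with the edge lengths of a minimum spanning tree), the helper names `h0_death_times` and `barcode_gap` are hypothetical, the five-point cloud is a toy example, and a 90-degree rotation stands in for "a second embedding of the same corpus".

```python
import math
from itertools import combinations


def h0_death_times(points):
    """Death times of 0-dimensional persistent homology classes in a
    Vietoris-Rips filtration. These equal the edge lengths of a minimum
    spanning tree, built here with Kruskal's algorithm and union-find."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component root, halving paths as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise Euclidean distances, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at this scale, so one H0 class dies.
            parent[ri] = rj
            deaths.append(dist)
    return deaths  # ascending, one entry per merge (n - 1 in total)


def barcode_gap(a, b):
    """Crude comparison (an assumption, not a standard metric): the largest
    difference between death times matched in sorted order."""
    if len(a) != len(b):
        raise ValueError("point clouds must have the same number of points")
    return max(abs(x - y) for x, y in zip(a, b))


# Toy "embedding": a cluster of three points and a distant cluster of two.
cloud = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
# Stand-in for a second embedding of the same data: rotate by 90 degrees.
rotated = [(-y, x) for x, y in cloud]

deaths_a = h0_death_times(cloud)
deaths_b = h0_death_times(rotated)
gap = barcode_gap(deaths_a, deaths_b)
```

In this toy case the three short death times reflect merges within the clusters, the single long one reflects the two clusters joining, and the rotated copy yields the same values. Real embeddings from different languages or models would not be related by an exact isometry, so any practical comparison would need a tolerance and probably higher-dimensional homology as well.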