Fangzhou Gao


2026

We present the results of a preliminary exploration of the topology of embedded manifolds as a semantic invariant. Our main question is whether the topology of a large embedded corpus is invariant in the following two senses. First, one might reasonably expect the same corpus in two languages to give topologically equivalent embeddings. Second, one might reasonably expect the same corpus embedded by two different embedding models to give topologically equivalent embeddings. In the paper we motivate these intuitions and give preliminary results indicating that they are, to some extent, justified.