Clare Arrington


2025

ConShift: Sense-based Language Variation Analysis using Flexible Alignment
Clare Arrington | Mauricio Gruppi | Sibel Adali
Findings of the Association for Computational Linguistics: NAACL 2025

We introduce ConShift, a family of alignment-based algorithms that enable semantic variation analysis at the sense level. Using independent senses of words induced from the contexts of tokens in two corpora, the approach aligns sense-enriched word embeddings through self-supervision and a flexible matching mechanism. This makes it possible to test for multiple sense-level language variations, such as sense gain/presence, loss/absence, and broadening/narrowing, while explaining the changes through visualization of related concepts. We illustrate the utility of the method with sense- and word-level semantic shift detection results on multiple evaluation datasets in diachronic settings and on dialect variation in the synchronic setting.
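
To make the general idea of alignment-based shift detection concrete, the following is a minimal sketch using orthogonal Procrustes alignment on toy data. It is not the ConShift algorithm: the abstract does not specify the self-supervision or flexible matching steps, so this sketch assumes a fixed, hand-picked anchor set instead. The sense labels (`bank_0`, `cell_1`, and so on), the random vectors, and the helper names `procrustes_rotation` and `cosine_distance` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def procrustes_rotation(source, target):
    """Return the orthogonal matrix R minimizing ||source @ R - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

def cosine_distance(a, b):
    """Cosine distance between two vectors (0 means same direction)."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
dim = 4

# Hypothetical sense-level vocabulary: one vector per (word, sense) pair.
senses = ["bank_0", "bank_1", "cell_0", "cell_1", "mouse_0", "mouse_1"]
corpus_a = rng.normal(size=(len(senses), dim))

# Toy corpus B: a rotated copy of corpus A, except that "cell_1"
# is replaced by an unrelated vector to simulate a sense-level change.
q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
corpus_b = corpus_a @ q
changed = senses.index("cell_1")
corpus_b[changed] = rng.normal(size=dim)

# Assumption: the anchor set is given (every sense except the changed one).
# A real system would have to select or learn the anchors.
anchors = [i for i in range(len(senses)) if i != changed]
rotation = procrustes_rotation(corpus_a[anchors], corpus_b[anchors])
aligned_a = corpus_a @ rotation

# After alignment, a large distance between the two vectors of a sense
# suggests that the sense is used differently in the two corpora.
shift = {s: cosine_distance(aligned_a[i], corpus_b[i]) for i, s in enumerate(senses)}
most_shifted = max(shift, key=shift.get)
```

In this toy setup the anchors align almost exactly, so only the replaced sense shows a noticeable distance. With real embeddings the distances form a continuum, and the choice of anchors strongly affects which senses appear to have shifted.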