ConShift: Sense-based Language Variation Analysis using Flexible Alignment

Clare Arrington, Mauricio Gruppi, Sibel Adali


Abstract
We introduce ConShift, a family of alignment-based algorithms that enable semantic variation analysis at the sense level. Starting from independent senses of words induced from the context of tokens in two corpora, ConShift aligns sense-enriched word embeddings using self-supervision and a flexible matching mechanism. This approach makes it possible to test for multiple sense-level language variations, such as sense gain/presence, loss/absence, and broadening/narrowing, while explaining the changes through visualization of related concepts. We illustrate the utility of the method with sense- and word-level semantic shift detection results for multiple evaluation datasets in diachronic settings and for dialect variation in the synchronic setting.
Anthology ID:
2025.findings-naacl.9
Volume:
Findings of the Association for Computational Linguistics: NAACL 2025
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
167–181
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.9/
Cite (ACL):
Clare Arrington, Mauricio Gruppi, and Sibel Adali. 2025. ConShift: Sense-based Language Variation Analysis using Flexible Alignment. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 167–181, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
ConShift: Sense-based Language Variation Analysis using Flexible Alignment (Arrington et al., Findings 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.9.pdf