Current Semantic-change Quantification Methods Struggle with Semantic Change Discovery in the Wild

Khonzoda Umarova, Lillian Lee, Laerdon Kim


Abstract
Methods for lexical semantic-change detection quantify changes in the meaning of words over time. Prior methods have excelled on established benchmarks consisting of pre-selected target words, chosen ahead of time due to the prohibitive cost of manually annotating all words. However, performance measured on small curated wordsets cannot reveal how well these methods perform at discovering semantic changes among the full corpus vocabulary, which is the actual end goal for many applications. In this paper, we implement a top-k setup to evaluate semantic-change discovery despite lacking complete annotations. (At the same time, we also extend the annotations in the commonly used LiverpoolFC and SemEval-EN benchmarks by 85% and 90%, respectively.) We deploy our evaluation setup on a battery of semantic-change detection methods under multiple variations. We find that when presented with a natural distribution of instances, all the methods struggle to rank known large changes higher than other words in the vocabulary. Furthermore, we manually verify that the majority of words with high detected-change scores in LiverpoolFC do not actually experience meaning changes. In fact, for most of the methods, fewer than half of the highest-ranked words were determined to have changed in meaning. Given the large performance discrepancies between existing benchmark results and discovery “in the wild”, we recommend that researchers direct more attention to semantic-change discovery and include it in their suite of evaluations. Our annotations and code for running evaluations are available at https://github.com/khonzoda/semantic-change-discovery-emnlp2025.
Anthology ID:
2025.emnlp-main.1791
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
35342–35355
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1791/
Cite (ACL):
Khonzoda Umarova, Lillian Lee, and Laerdon Kim. 2025. Current Semantic-change Quantification Methods Struggle with Semantic Change Discovery in the Wild. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 35342–35355, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Current Semantic-change Quantification Methods Struggle with Semantic Change Discovery in the Wild (Umarova et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1791.pdf
Checklist:
 2025.emnlp-main.1791.checklist.pdf