Mariia Fedorova


2024

pdf
Definition generation for lexical semantic change detection
Mariia Fedorova | Andrey Kutuzov | Yves Scherrer
Findings of the Association for Computational Linguistics: ACL 2024

We use contextualized word definitions generated by large language models as semantic representations in the task of diachronic lexical semantic change detection (LSCD). In short, generated definitions are used as ‘senses’, and the change score of a target word is retrieved by comparing their distributions in two time periods under comparison. On the material of five datasets and three languages, we show that generated definitions are indeed specific and general enough to convey a signal sufficient to rank sets of words by the degree of their semantic change over time. Our approach is on par with or outperforms prior non-supervised sense-based LSCD methods. At the same time, it preserves interpretability and allows to inspect the reasons behind a specific shift in terms of discrete definitions-as-senses. This is another step in the direction of explainable semantic change modeling.

pdf
AXOLOTL’24 Shared Task on Multilingual Explainable Semantic Change Modeling
Mariia Fedorova | Timothee Mickus | Niko Partanen | Janine Siewert | Elena Spaziani | Andrey Kutuzov
Proceedings of the 5th Workshop on Computational Approaches to Historical Language Change

pdf
Enriching Word Usage Graphs with Cluster Definitions
Andrey Kutuzov | Mariia Fedorova | Dominik Schlechtweg | Nikolay Arefyev
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

We present a dataset of word usage graphs (WUGs), where the existing WUGs for multiple languages are enriched with cluster labels functioning as sense definitions. They are generated from scratch by fine-tuned encoder-decoder language models. The conducted human evaluation has shown that these definitions match the existing clusters in WUGs better than the definitions chosen from WordNet by two baseline systems. At the same time, the method is straightforward to use and easy to extend to new languages. The resulting enriched datasets can be extremely helpful for moving on to explainable semantic change modeling.