Anna Mosolova


2024

Injecting Wiktionary to improve token-level contextual representations using contrastive learning
Anna Mosolova | Marie Candito | Carlos Ramisch
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)

While static word embeddings are blind to context, for lexical semantics tasks context is rather too present in contextual word embeddings, vectors of same-meaning occurrences being too different (Ethayarajh, 2019). Fine-tuning pre-trained language models (PLMs) using contrastive learning was proposed, leveraging automatically self-augmented examples (Liu et al., 2021b). In this paper, we investigate how to inject a lexicon as an alternative source of supervision, using the English Wiktionary. We also test how dimensionality reduction impacts the resulting contextual word embeddings. We evaluate our approach on the Word-In-Context (WiC) task, in the unsupervised setting (not using the training set). We achieve a new SoTA result on the original WiC test set. We also propose two new WiC test sets, on which we show that our fine-tuning method achieves substantial improvements. We also observe improvements, although modest, on the semantic frame induction task. Although we experimented on English to allow comparison with related work, our method is adaptable to the many languages for which large Wiktionaries exist.

2022

The Only Chance to Understand: Machine Translation of the Severely Endangered Low-resource Languages of Eurasia
Anna Mosolova | Kamel Smaili
Proceedings of the Fifth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2022)

Numerous machine translation systems have been proposed since the appearance of this task. Nowadays, new large language model-based algorithms show results that sometimes surpass human performance on high-resource languages. Nevertheless, this is still not the case for low-resource languages, for which these algorithms have not shown equally impressive results. In this work, we compare three generations of machine translation models on seven low-resource languages and go a step further by proposing a new method of automatic parallel data augmentation using a state-of-the-art generative model.

2018

Conditional Random Fields for Metaphor Detection
Anna Mosolova | Ivan Bondarenko | Vadim Fomin
Proceedings of the Workshop on Figurative Language Processing

We present an algorithm for detecting metaphor in sentences, which was used in the Shared Task on Metaphor Detection at the First Workshop on Figurative Language Processing. The algorithm is based on a variety of features and Conditional Random Fields.