Maria Ponomareva

2021

pdf abs
Zhestyatsky at SemEval-2021 Task 2: ReLU over Cosine Similarity for BERT Fine-tuning
Boris Zhestiankin | Maria Ponomareva
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper presents our contribution to SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC). Our experiments cover English (EN-EN) sub-track from the multilingual setting of the task. We experiment with several pre-trained language models and investigate an impact of different top-layers on fine-tuning. We find the combination of Cosine Similarity and ReLU activation leading to the most effective fine-tuning procedure. Our best model results in accuracy 92.7%, which is the fourth-best score in EN-EN sub-track.

2019

pdf abs
Char-RNN for Word Stress Detection in East Slavic Languages
Ekaterina Chernyak | Maria Ponomareva | Kirill Milintsevich
Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects

We explore how well a sequence labeling approach, namely, recurrent neural network, is suited for the task of resource-poor and POS tagging free word stress detection in the Russian, Ukranian, Belarusian languages. We present new datasets, annotated with the word stress, for the three languages and compare several RNN models trained on three languages and explore possible applications of the transfer learning for the task. We show that it is possible to train a model in a cross-lingual setting and that using additional languages improves the quality of the results.

pdf abs
AGRR 2019: Corpus for Gapping Resolution in Russian
Maria Ponomareva | Kira Droganova | Ivan Smurov | Tatiana Shavrina
Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing

This paper provides a comprehensive overview of the gapping dataset for Russian that consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019) - a competition aimed at stimulating the development of NLP tools and methods for processing of ellipsis. In this paper, we pay special attention to the gapping resolution methods that were introduced within the shared task as well as an alternative test set that illustrates that our corpus is a diverse and representative subset of Russian language gapping sufficient for effective utilization of machine learning techniques.

2017

pdf abs
Automated Word Stress Detection in Russian
Maria Ponomareva | Kirill Milintsevich | Ekaterina Chernyak | Anatoly Starostin
Proceedings of the First Workshop on Subword and Character Level Models in NLP

In this study we address the problem of automated word stress detection in Russian using character level models and no part-speech-taggers. We use a simple bidirectional RNN with LSTM nodes and achieve accuracy of 90% or higher. We experiment with two training datasets and show that using the data from an annotated corpus is much more efficient than using only a dictionary, since it allows to retain the context of the word and its morphological features.

Co-authors

Boris Zhestiankin 1

Anatoli Starostin 1