Mikhail Kopotev


Online Extraction of Russian Multiword Expressions
Mikhail Kopotev | Llorenç Escoter | Daria Kormacheva | Matthew Pierce | Lidia Pivovarova | Roman Yangarber
The 5th Workshop on Balto-Slavic Natural Language Processing


Automatic Detection of Stable Grammatical Features in N-Grams
Mikhail Kopotev | Lidia Pivovarova | Natalia Kochetkova | Roman Yangarber
Proceedings of the 9th Workshop on Multiword Expressions


Designing and Evaluating a Russian Tagset
Serge Sharoff | Mikhail Kopotev | Tomaž Erjavec | Anna Feldman | Dagmar Divjak
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper reports on the principles behind designing a tagset to cover Russian morphosyntactic phenomena, on modifications of the core tagset, and on its evaluation. The tagset is based on the MULTEXT-East framework, and the design decisions aimed to balance the parameters important to linguists against the feasibility of detecting and disambiguating them automatically. The final tagset contains about 500 tags and achieves about 95% accuracy on the disambiguated portion of the Russian National Corpus. We have also produced a test set that can be shared with other researchers.