Abstract
This paper studies generation methods for paraphrasing in Russian. Several transformer-based models (Russian and multilingual) are trained on a collected corpus of paraphrases. We compare the models, contrast the quality of their paraphrases under different ranking methods, and apply paraphrasing as an augmentation procedure for different tasks. The contributions of the work are a combined paraphrasing dataset, fine-tuned generative models for the Russian paraphrasing task, and an open-source tool for simple usage of the paraphrasers.
- Anthology ID:
- 2021.bsnlp-1.2
- Volume:
- Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing
- Month:
- April
- Year:
- 2021
- Address:
- Kyiv, Ukraine
- Venue:
- BSNLP
- SIG:
- SIGSLAV
- Publisher:
- Association for Computational Linguistics
- Pages:
- 11–19
- URL:
- https://aclanthology.org/2021.bsnlp-1.2
- Cite (ACL):
- Alena Fenogenova. 2021. Russian Paraphrasers: Paraphrase with Transformers. In Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing, pages 11–19, Kyiv, Ukraine. Association for Computational Linguistics.
- Cite (Informal):
- Russian Paraphrasers: Paraphrase with Transformers (Fenogenova, BSNLP 2021)
- PDF:
- https://preview.aclanthology.org/auto-file-uploads/2021.bsnlp-1.2.pdf
- Code:
- sberbank-ai/ru-gpts
- Data:
- DaNetQA, TERRa