Martin Tamajka
2022
SlovakBERT: Slovak Masked Language Model
Matúš Pikuliak | Štefan Grivalský | Martin Konôpka | Miroslav Blšták | Martin Tamajka | Viktor Bachratý | Marian Simko | Pavol Balážik | Michal Trnka | Filip Uhlárik
Findings of the Association for Computational Linguistics: EMNLP 2022
We introduce a new Slovak masked language model called SlovakBERT. To the best of our knowledge, this is the first paper discussing Slovak transformer-based language models. We evaluate our model on several NLP tasks and achieve state-of-the-art results. This evaluation is likewise the first attempt to establish a benchmark for Slovak language models. We publish the masked language model, as well as the fine-tuned models for part-of-speech tagging, sentiment analysis, and semantic textual similarity.