Ľuboš Kriš
2025
SlovakBabyLM: Replication of the BabyLM and Sample-efficient Pretraining for a Low-Resource Language
Ľuboš Kriš | Marek Suppa
Proceedings of the First BabyLM Workshop
In recent years, a growing number of language-specific models (LMs) within the Slavic language family have been built on the BERT architecture. However, as the number of LM parameters increases, a larger amount of text is required for good performance, which can hinder the development of LMs for specific languages. Our research looks for a solution in Curriculum Learning (CL) methods, which can help us build better models from less text than current LMs require and thus support the pretraining of models for low-resource languages (LRLs). We therefore replicate the BabyLM Challenge in the Slovak language (Dataset: https://huggingface.co/datasets/ubokri/SlovakBabyLM, Code: https://github.com/baucek/Slovakbabylm/tree/main). Additionally, we apply CL to compare how CL methods behave on English and Slovak and to evaluate whether CL improves LM performance. Our experiments show that using CL methods as a preprocessing step significantly improves model performance on sentiment analysis and question answering.
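To make the idea concrete, below is a minimal sketch of curriculum learning used as a preprocessing step: training texts are ordered from "easy" to "hard" before pretraining. The difficulty proxy (token count) and the `difficulty`/`curriculum_order` helpers are illustrative assumptions; the abstract does not specify which CL criteria the paper actually uses.

```python
# A minimal sketch of CL as preprocessing: sort the corpus from easy to hard
# before feeding it to the pretraining loop. Sentence length is a common
# difficulty proxy, used here only for illustration.

def difficulty(text: str) -> int:
    """Assumed proxy for sample difficulty: number of whitespace tokens."""
    return len(text.split())

def curriculum_order(corpus: list[str]) -> list[str]:
    """Return the corpus sorted from shortest (easiest) to longest (hardest)."""
    return sorted(corpus, key=difficulty)

if __name__ == "__main__":
    corpus = [
        "Slovenčina je západoslovanský jazyk.",
        "Model sa učí.",
        "Kurikulárne učenie zoraďuje trénovacie dáta od jednoduchých po zložité.",
    ]
    for text in curriculum_order(corpus):
        print(difficulty(text), text)
```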
o-MEGA: Optimized Methods for Explanation Generation and Analysis
Ľuboš Kriš | Jaroslav Kopčan | Qiwei Peng | Andrej Ridzik | Marcel Veselý | Martin Tamajka
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
The proliferation of transformer-based language models has revolutionized the NLP domain while simultaneously introducing significant challenges regarding model transparency and trustworthiness. The difficulty of building explainable systems in this domain is evidenced by the extensive array of explanation methods and evaluation metrics researchers have developed. To address the challenge of selecting optimal explainability approaches, we present o-mega, a hyperparameter optimization tool designed to automatically identify the most effective explainable AI methods and their configurations within the semantic matching domain. We evaluate o-mega on a post-claim matching pipeline using a curated dataset of social media posts paired with refuting claims. Our tool systematically explores different explanation methods and their hyperparameters, demonstrating improved transparency in automated fact-checking systems. Such automated optimization of explanation methods can significantly enhance the interpretability of claim-matching models in critical applications such as misinformation detection, contributing to more trustworthy and transparent AI systems.
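As a rough illustration of the kind of search the abstract describes, here is a self-contained sketch that scores candidate (explanation method, hyperparameter) configurations and keeps the best one. The method names, the search space, and the `evaluate` stub are assumptions for illustration only and do not reflect o-mega's actual API or evaluation metrics.

```python
# Illustrative sketch: exhaustively search explanation-method configurations
# and return the one with the highest evaluation score. In a real pipeline,
# evaluate() would run the method on the claim-matching model and compute a
# faithfulness/plausibility metric; here it is a random placeholder.

import itertools
import random

SEARCH_SPACE = {
    "method": ["integrated_gradients", "shap", "lime"],  # assumed candidates
    "n_steps": [20, 50, 100],                            # assumed hyperparameter
    "baseline": ["zero", "mean"],                        # assumed hyperparameter
}

def evaluate(config: dict) -> float:
    """Placeholder score for a configuration (random stand-in)."""
    return random.random()

def grid_search(space: dict) -> tuple[dict, float]:
    """Score every configuration in the grid and return the best one."""
    keys = list(space)
    best_config, best_score = None, float("-inf")
    for values in itertools.product(*(space[k] for k in keys)):
        config = dict(zip(keys, values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

if __name__ == "__main__":
    config, score = grid_search(SEARCH_SPACE)
    print(f"best explanation config: {config} (score={score:.3f})")
```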