Michal Ptaszynski

2025

RBG-AI: Benefits of Multilingual Language Models for Low-Resource Languages
Barathi Ganesh Hb | Michal Ptaszynski
Proceedings of the Tenth Conference on Machine Translation

This paper investigates how multilingual language models benefit low-resource languages through our submission to the WMT 2025 Low-Resource Indic Language Translation shared task. We explore whether languages from related families can effectively support translation for low-resource languages that were absent or underrepresented during model training. Using a quantized multilingual pretrained foundation model, we examine zero-shot translation capabilities and cross-lingual transfer effects across three language families: Tibeto-Burman, Indo-Aryan, and Austroasiatic. Our findings demonstrate that multilingual models failed to leverage linguistic similarities, particularly evidenced within the Tibeto-Burman family. The study provides insights into the practical feasibility of zero-shot translation for low-resource language settings and the role of language family relationships in multilingual model performance.

2024

nowhash at SemEval-2024 Task 4: Exploiting Fusion of Transformers for Detecting Persuasion Techniques in Multilingual Memes
Abu Nowhash Chowdhury | Michal Ptaszynski
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

Nowadays, memes are considered one of the most prominent media for disseminating information on social media. Memes are typically constructed in multilingual settings by combining visuals with text. People sometimes use memes to influence mass audiences through rhetorical and psychological techniques, such as causal oversimplification, name-calling, and smear. Identifying those techniques is challenging given memes’ multimodal characteristics. To address these challenges, SemEval-2024 Task 4 introduced a shared task focusing on detecting persuasion techniques in multilingual memes. This paper presents our participation in subtasks 1 and 2(b). For subtask 1, we use a finetuned language-agnostic BERT sentence embedding (LaBSE) model to extract effective contextual features from meme text. For subtask 2(b), we finetune the vision transformer and XLM-RoBERTa to extract effective contextual information from meme image and text data. Finally, we unify those features and employ a single feed-forward linear layer on top to obtain the prediction label. Experimental results on the SemEval 2024 Task 4 benchmark dataset demonstrated the effectiveness of our proposed methods for subtasks 1 and 2(b).

2023

Improving Polish to English Neural Machine Translation with Transfer Learning: Effects of Data Volume and Language Similarity
Juuso Eronen | Michal Ptaszynski | Karol Nowakowski | Zheng Lin Chia
Proceedings of the 1st International Workshop on Multilingual, Multimodal and Multitask Language Generation

Improving Low-Resource Speech Recognition through Multilingual Fine-Tuning with Language Identifiers and Self-Training
Karol Nowakowski | Michal Ptaszynski
Proceedings of the 35th Conference on Computational Linguistics and Speech Processing (ROCLING 2023)

2020

Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations
Michal Ptaszynski | Bartosz Ziolko
Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations

Epistolary Education in 21st Century: A System to Support Composition of E-mails by Students to Superiors in Japanese
Kenji Ryu | Michal Ptaszynski
Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations

E-mail is a communication tool widely used by people of all ages on the Internet today, often in business and formal situations, especially in Japan. Moreover, Japanese E-mail communication has a set of specific rules taught using specialized guidebooks. E-mail literacy education for many Japanese students is typically provided in a traditional, yet inefficient, lecture-based way. We propose a system to support Japanese students in writing E-mails to superiors (teachers, job hunting representatives, etc.). We first investigate the importance of formal E-mails in Japan and what is needed to successfully write a formal E-mail. Next, we develop the system in accordance with those rules. Finally, we evaluate the system in two ways. The evaluation, although performed on a small number of samples, yielded generally positive results and clearly indicated additional ways to improve the system.