Mateusz Lango


2023

Alexander Knox at SemEval-2023 Task 5: The comparison of prompting and standard fine-tuning techniques for selecting the type of spoiler needed to neutralize a clickbait
Mateusz Woźny | Mateusz Lango
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Clickbait posts are a common problem on social media platforms, as they often deceive users by providing misleading or sensational headlines that do not match the content of the linked web page. The aim of this study is to create a technique for identifying the specific type of suitable spoiler - be it a phrase, a passage, or a multipart spoiler - needed to neutralize clickbait posts. This is achieved by developing a machine learning classifier that analyzes both the clickbait post and the linked web page. Modern approaches for constructing a text classifier usually rely on fine-tuning a transformer-based model pre-trained on large unsupervised corpora. However, recent advances in the development of large-scale language models have led to the emergence of a new transfer learning paradigm based on prompt engineering. In this work, we study these two transfer learning techniques and compare their effectiveness for the clickbait spoiler-type detection task. Our experimental results show that for this task, using the standard fine-tuning method gives better results than using prompting. The best model can achieve a similar performance to that presented by Hagen et al. (2022).

2020

A Closer Look on Unsupervised Cross-lingual Word Embeddings Mapping
Kamil Pluciński | Mateusz Lango | Michał Zimniewicz
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this work, we study the unsupervised cross-lingual word embeddings mapping method presented by Artetxe et al. (2018). First, we successfully reproduced the experiments performed in the original work, finding only minor differences. Furthermore, we verified the method’s robustness on different embedding representations and new language pairs, particularly those involving Slavic languages like Polish or Czech. We also performed an experimental analysis of the impact of the method’s parameters on the final result. Finally, we looked for an alternative way of initialization, which directly relies on the isometric assumption. Our work confirms the results presented earlier, at the same time pointing at interesting problems occurring while using the method with different types of embeddings or on less-common language pairs.

2018

Semi-Automatic Construction of Word-Formation Networks (for Polish and Spanish)
Mateusz Lango | Magda Ševčíková | Zdeněk Žabokrtský
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

PUT at SemEval-2016 Task 4: The ABC of Twitter Sentiment Analysis
Mateusz Lango | Dariusz Brzezinski | Jerzy Stefanowski
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)