Mateusz Lango

2023

Alexander Knox at SemEval-2023 Task 5: The comparison of prompting and standard fine-tuning techniques for selecting the type of spoiler needed to neutralize a clickbait
Mateusz Woźny | Mateusz Lango
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

Clickbait posts are a common problem on social media platforms, as they often deceive users by providing misleading or sensational headlines that do not match the content of the linked web page. The aim of this study is to create a technique for identifying the specific type of suitable spoiler - be it a phrase, a passage, or a multipart spoiler - needed to neutralize clickbait posts. This is achieved by developing a machine learning classifier analyzing both the clickbait post and the linked web page. Modern approaches for constructing a text classifier usually rely on fine-tuning a transformer-based model pre-trained on large unsupervised corpora. However, recent advances in the development of large-scale language models have led to the emergence of a new transfer learning paradigm based on prompt engineering. In this work, we study these two transfer learning techniques and compare their effectiveness for the clickbait spoiler-type detection task. Our experimental results show that for this task, using the standard fine-tuning method gives better results than using prompting. The best model can achieve a similar performance to that presented by Hagen et al. (2022).

2020

A Closer Look on Unsupervised Cross-lingual Word Embeddings Mapping
Kamil Pluciński | Mateusz Lango | Michał Zimniewicz
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this work, we study the unsupervised cross-lingual word embeddings mapping method presented by Artetxe et al. (2018). First, we successfully reproduced the experiments performed in the original work, finding only minor differences. Furthermore, we verified the method’s robustness on different embedding representations and new language pairs, particularly those involving Slavic languages like Polish or Czech. We also performed an experimental analysis of the impact of the method’s parameters on the final result. Finally, we looked for an alternative way of initialization, which directly relies on the isometric assumption. Our work confirms the results presented earlier, at the same time pointing at interesting problems occurring while using the method with different types of embeddings or on less-common language pairs.
