Małgorzata Grębowiec
2025
Unveiling Dual Quality in Product Reviews: An NLP-Based Approach
Rafał Poświata | Marcin Michał Mirończuk | Sławomir Dadas | Małgorzata Grębowiec | Michał Perełkiewicz
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Consumers often face inconsistent product quality, particularly when identical products vary between markets, a situation known as the dual quality problem. To identify and address this issue, automated techniques are needed. This paper explores how natural language processing (NLP) can aid in detecting such discrepancies and presents the full process of developing a solution. First, we describe in detail the creation of a new Polish-language dataset with 1,957 reviews, 540 of which highlight dual quality issues. We then discuss experiments with various approaches, such as SetFit with sentence-transformers, transformer-based encoders, and LLMs, including error analysis and robustness verification. Additionally, we evaluate multilingual transfer using a subset of opinions in English, French, and German. The paper concludes with insights on deployment and practical applications.
2023
OPI PIB at SemEval-2023 Task 1: A CLIP-based Solution Paired with an Additional Word Context Extension
Małgorzata Grębowiec
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This article presents our solution for SemEval-2023 Task 1: Visual Word Sense Disambiguation. The aim of the task was to select, from a list of ten images, the one most suitable for a given word accompanied by a small textual context. Our solution comprises two parts. The first attempts to further extend the textual context, based on word definitions contained in WordNet and in Open English WordNet. The second selects the most suitable image using the CLIP model with the previously extended word context and additional information obtained from the BEiT image classification model. Our solution achieved a result of 70.84% on the official test dataset for the English language.