George Stoica
2023
Appeal for Attention at SemEval-2023 Task 3: Data augmentation extension strategies for detection of online news persuasion techniques
Sergiu Amihaesei | Laura Cornei | George Stoica
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
In this paper, we propose and explore the impact of four dataset augmentation and extension strategies that we used for subtask 3 of SemEval-2023 Task 3: multi-label classification of persuasion techniques in a multilingual context. We consider two augmentation methods (one based on a modified version of synonym replacement and one based on translations) and two ways of extending the training dataset (using filtered data generated by GPT-3 and using a dataset from a previous competition). We studied the effects of these techniques by fine-tuning a pretrained XLM-RoBERTa-Large model on the augmented and/or extended training dataset. Using the augmentation methods alone, we obtained 3rd place for English, 13th place for Italian, and between 5th and 9th place for the other 7 languages during the competition.
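As a rough illustration of the first augmentation idea, the sketch below applies plain synonym replacement to multi-label training examples while keeping their labels unchanged. It is not the authors' modified variant, which the abstract does not describe. The `SYNONYMS` table, the function names `synonym_replace` and `augment`, the replacement probability `p`, and the label strings are all hypothetical and chosen only for this example.

```python
import random

# Hypothetical toy synonym table (assumption); a real setup would draw on a
# lexical resource covering each target language.
SYNONYMS = {
    "big": ["large", "huge"],
    "danger": ["threat", "peril"],
    "say": ["claim", "state"],
}


def synonym_replace(text, p=0.5, rng=None):
    """Replace each word that has a known synonym with probability p."""
    if rng is None:
        rng = random.Random(0)
    out = []
    for word in text.split():
        options = SYNONYMS.get(word.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)


def augment(examples, n_copies=1, p=0.5, seed=0):
    """Return the original (text, labels) pairs plus augmented copies.

    Labels are copied unchanged, on the assumption that swapping in a
    synonym does not alter which persuasion techniques are present.
    """
    rng = random.Random(seed)
    augmented = list(examples)
    for text, labels in examples:
        for _ in range(n_copies):
            new_text = synonym_replace(text, p=p, rng=rng)
            if new_text != text:  # skip copies that are identical to the source
                augmented.append((new_text, list(labels)))
    return augmented


if __name__ == "__main__":
    # "label_a" and "label_b" are placeholder label names, not task labels.
    train = [("they say a big danger is coming", ["label_a", "label_b"])]
    for text, labels in augment(train, n_copies=2, p=1.0):
        print(labels, text)
```

The augmented list can then be passed to whatever fine-tuning pipeline is in use; the multi-label targets stay attached to each rewritten text.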