Abstract
In this paper, we proposed and explored the impact of four different dataset augmentation and extension strategies that we used for solving subtask 3 of SemEval-2023 Task 3: multi-label persuasion techniques classification in a multi-lingual context. We considered two types of augmentation methods (one based on a modified version of synonym replacement and one based on translations) and two ways of extending the training dataset (using filtered data generated by GPT-3 and using a dataset from a previous competition). We studied the effects of these techniques by using the augmented and/or extended training dataset to fine-tune a pretrained XLM-RoBERTa-Large model. Using the augmentation methods alone, we obtained 3rd place for English, 13th place for Italian, and between 5th and 9th place for the other 7 languages during the competition.
- Anthology ID:
- 2023.semeval-1.84
- Volume:
- Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 616–623
- URL:
- https://aclanthology.org/2023.semeval-1.84
- DOI:
- 10.18653/v1/2023.semeval-1.84
- Cite (ACL):
- Sergiu Amihaesei, Laura Cornei, and George Stoica. 2023. Appeal for Attention at SemEval-2023 Task 3: Data augmentation extension strategies for detection of online news persuasion techniques. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 616–623, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Appeal for Attention at SemEval-2023 Task 3: Data augmentation extension strategies for detection of online news persuasion techniques (Amihaesei et al., SemEval 2023)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2023.semeval-1.84.pdf