This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
AbdulSamad
This paper presents a solution to the food hazard detection challenge of SemEval-2025 Task 9, focusing on overcoming class imbalance through data augmentation. We employ large language models (LLMs) such as GPT-4o, Gemini Flash 1.5, and T5 to generate synthetic data, alongside methods such as synonym replacement, back-translation, and paraphrasing. The augmented datasets are used to fine-tune transformer-based models such as DistilBERT, improving their performance at detecting food hazards and categorizing products. Our approach achieves notable improvements in macro-F1 scores on both subtasks, although challenges remain in detecting implicit hazards and handling extreme class imbalance. The paper also discusses additional training techniques, including class weighting and ensemble modeling. Despite these improvements, further work is needed to refine hazard detection, particularly for rare and implicit categories.
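The abstract names class weighting among the training techniques. A minimal sketch of one common scheme, inverse-frequency weighting (the resulting weights can be passed, e.g., to PyTorch's CrossEntropyLoss via its weight argument), might look as follows; the function name and the toy hazard labels are illustrative, not taken from the paper:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class inversely to its frequency, normalized so that a
    uniformly distributed class gets weight 1.0; rare classes get their
    loss contribution scaled up."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}

# Toy imbalanced label set: one hazard class dominates.
labels = ["biological"] * 8 + ["chemical"] * 1 + ["foreign-body"] * 1
weights = inverse_frequency_weights(labels)
# The majority class gets a weight below 1, the rare classes above 1.
```
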
Our team, Narrative Miners, participated in SemEval-2025 Task 10, which addresses the detection of manipulative narratives in online news about the Ukraine-Russia war and climate change. We worked on three subtasks: classifying entity roles, categorizing narratives and subnarratives, and generating concise narrative explanations. Using transformer-based models such as BART, BERT, GPT-2, and Flan-T5, we implemented a structured pipeline and applied several data augmentation techniques to mitigate the class imbalance present in the dataset. BART-CNN proved to be our best-performing model, significantly improving classification accuracy and explanation generation. Evaluation metrics were chosen separately for each subtask to match the requirements of its particular output. Despite challenges such as dataset limitations and class imbalance, our approach demonstrated the effectiveness of hierarchical classification and multilingual analysis in combating online disinformation, and we hope this work plays a part in mitigating the impact of harmful disinformation.
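The abstract highlights hierarchical classification over narratives and subnarratives. One common way to realize this is two-stage decoding: pick the coarse narrative first, then restrict the second stage to that narrative's own subnarratives. The sketch below illustrates the idea; the taxonomy entries and function name are hypothetical, not the task's actual label set:

```python
# Hypothetical narrative -> subnarrative taxonomy (illustrative labels only).
NARRATIVE_TREE = {
    "war-discrediting": ["blaming-west", "heroizing-aggressor"],
    "climate-denial": ["hoax-claims", "anti-policy"],
}

def hierarchical_decode(narrative_scores, sub_scores):
    """narrative_scores / sub_scores map label -> model score (float).
    Restricting the second stage to the winning narrative's children keeps
    predictions consistent with the taxonomy, even if an out-of-branch
    subnarrative happens to score higher overall."""
    narrative = max(narrative_scores, key=narrative_scores.get)
    children = NARRATIVE_TREE[narrative]
    sub = max(children, key=lambda c: sub_scores.get(c, float("-inf")))
    return narrative, sub
```

Note that in this toy setup a high-scoring subnarrative from a different branch is ignored once the coarse narrative is fixed, which is exactly the constraint the hierarchy enforces.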
Emotion detection in text has emerged as a pivotal challenge in Natural Language Processing (NLP), particularly in multilingual and cross-lingual contexts. This paper presents our participation in SemEval-2025 Task 11, focusing on three subtasks: Multi-label Emotion Detection, Emotion Intensity Prediction, and Cross-lingual Emotion Detection. Leveraging state-of-the-art transformer models such as BERT and XLM-RoBERTa, we implemented baseline models and ensemble techniques to enhance predictive accuracy. Additionally, we applied data augmentation and translation-based cross-lingual emotion detection to address linguistic and class imbalances. Our results demonstrated significant improvements in F1 scores and Pearson correlations, showcasing the effectiveness of ensemble learning and transformer-based architectures in emotion recognition. This work advances the field by providing robust methods for emotion detection, particularly in low-resource and multilingual settings.
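The abstract credits ensemble learning for the multi-label subtask. A common ensemble scheme in this setting is soft voting: average each label's probability across models and emit every label whose mean crosses a threshold. The sketch below is a minimal illustration under that assumption; the function name, emotion labels, and threshold are illustrative, not the paper's actual configuration:

```python
def ensemble_multilabel(prob_lists, threshold=0.5):
    """Soft-voting ensemble for multi-label classification.
    prob_lists: one dict per model, mapping label -> probability.
    Returns the sorted list of labels whose mean probability meets the
    threshold, plus the per-label means."""
    n = len(prob_lists)
    labels = prob_lists[0].keys()
    mean = {l: sum(p[l] for p in prob_lists) / n for l in labels}
    return sorted(l for l, m in mean.items() if m >= threshold), mean

# Two toy models disagree on "anger"; averaging smooths the decision.
model_a = {"joy": 0.8, "anger": 0.4, "fear": 0.2}
model_b = {"joy": 0.6, "anger": 0.7, "fear": 0.1}
predicted, means = ensemble_multilabel([model_a, model_b])
```
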