Liting Huang
Infodemics and health misinformation have a significant negative impact on individuals and society, exacerbating confusion and increasing hesitancy in adopting recommended health measures. Recent advances in generative AI, capable of producing realistic, human-like text and images, have significantly accelerated the spread and expanded the reach of health misinformation, resulting in an alarming surge in its dissemination. To combat infodemics, most existing work has focused on developing misinformation datasets from social media and fact-checking platforms, but such datasets face limitations in topical coverage, inclusion of AI-generated content, and accessibility of raw content. To address these gaps, we present MM-Health, a large-scale multimodal misinformation dataset in the health domain consisting of 34,746 news articles encompassing both textual and visual information. MM-Health includes human-generated multimodal information (5,776 articles) and AI-generated multimodal information (28,880 articles) produced by various SOTA generative AI models. Additionally, we benchmark our dataset on three tasks (reliability checks, originality checks, and fine-grained AI detection), demonstrating that existing SOTA models struggle to accurately distinguish the reliability and origin of information. Our dataset aims to support the development of misinformation detection across various health scenarios, facilitating the detection of human- and machine-generated content at the multimodal level.
SemEval-2025 Task 1 focuses on ranking images by their alignment with a given nominal compound that may carry idiomatic meaning, in both English and Brazilian Portuguese. To address this challenge, this work uses generative large language models (LLMs) and multilingual CLIP models to enhance idiomatic compound representations. LLMs generate idiomatic meanings for potentially idiomatic compounds, enriching their semantic interpretation. These meanings are then encoded using multilingual CLIP models and serve as representations for image ranking. Contrastive learning and data augmentation techniques are applied to fine-tune these embeddings for improved performance. Experimental results show that multimodal representations extracted through this method outperform those based solely on the original nominal compounds. The fine-tuning approach shows promising results but is less effective than using the embeddings without fine-tuning.
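As a rough illustration of the ranking step, the sketch below encodes an LLM-generated idiomatic meaning with a multilingual CLIP text encoder and ranks candidate images by cosine similarity in the shared embedding space. The checkpoint names and the rank_images helper are illustrative assumptions, not the authors' exact setup:

    from sentence_transformers import SentenceTransformer, util
    from PIL import Image

    # Multilingual text encoder aligned with CLIP's image space (covers Portuguese).
    text_model = SentenceTransformer("sentence-transformers/clip-ViT-B-32-multilingual-v1")
    # CLIP image encoder sharing the same embedding space as the text model above.
    image_model = SentenceTransformer("clip-ViT-B-32")

    def rank_images(idiomatic_meaning, image_paths):
        # Encode the LLM-generated meaning instead of the raw nominal compound.
        text_emb = text_model.encode(idiomatic_meaning, convert_to_tensor=True)
        images = [Image.open(p) for p in image_paths]
        image_embs = image_model.encode(images, convert_to_tensor=True)
        # Cosine similarity in the joint space gives the ranking scores.
        scores = util.cos_sim(text_emb, image_embs)[0]
        return sorted(zip(image_paths, scores.tolist()), key=lambda x: x[1], reverse=True)

    # e.g. a meaning an LLM might produce for "elephant in the room":
    # ranking = rank_images("an obvious problem that everyone avoids discussing", paths)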
This paper presents a framework for perceived emotion intensity prediction, focusing on SemEval-2025 Task 11 Track B. The task involves predicting the intensity of five perceived emotions (anger, fear, joy, sadness, and surprise) on an ordinal scale from 0 (no emotion) to 3 (high emotion). Our approach builds on the method we introduced at the WASSA workshop and enhances it by integrating ModernBERT in place of the traditional BERT model within a boosting-based ensemble framework. To address the difficulty of capturing fine-grained emotional distinctions, we incorporate class-preserving mixup data augmentation, a custom Pearson CombinedLoss function, and fine-tuned transformer models, including ModernBERT, RoBERTa, and DeBERTa. Compared to individual fine-tuned transformer models (BERT, RoBERTa, DeBERTa, and ModernBERT) without augmentation or ensemble learning, our approach demonstrates significant improvements. The proposed system achieves an average Pearson correlation coefficient of 0.768 on the test set, outperforming the best individual baseline model. In particular, the model performs best for sadness (r = 0.808) and surprise (r = 0.770), highlighting its ability to capture subtle intensity variations in text. Despite these improvements, challenges such as data imbalance, performance on low-resource emotions (e.g., anger and fear), and the need for refined data augmentation techniques remain open for future research.
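The abstract does not detail where the class-preserving mixup is applied; the sketch below is a minimal PyTorch version under the assumption that examples are interpolated only with partners sharing the same intensity label, so the ordinal target can be kept unchanged:

    import torch

    def class_preserving_mixup(features, labels, alpha=0.2):
        # Interpolation weight drawn from a Beta distribution, as in standard mixup.
        lam = torch.distributions.Beta(alpha, alpha).sample().item()
        partner = torch.empty_like(labels)
        for c in labels.unique():
            idx = (labels == c).nonzero(as_tuple=True)[0]
            # Shuffle indices within each class; singleton classes pair with themselves.
            partner[idx] = idx[torch.randperm(len(idx))]
        mixed = lam * features + (1 - lam) * features[partner]
        # Labels are returned untouched: mixing never crosses class boundaries.
        return mixed, labels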
In the realm of conversational empathy and emotion prediction, emotions are frequently categorized into multiple levels. This study seeks to enhance the performance of emotion prediction models by incorporating the Pearson correlation coefficient as a regularization term in the loss function. This regularization ensures closer alignment between predicted and actual emotion levels, mitigating extreme predictions and yielding smoother, more consistent outputs. Such outputs are essential for capturing the subtle transitions between continuous emotion levels. Through experimental comparisons between models with and without Pearson regularization, our findings demonstrate that integrating the Pearson correlation coefficient significantly boosts model performance, yielding higher correlation scores and more accurate predictions. Our system officially ranked 9th on Track 2: CONV-turn. The code for our model can be found at Link.
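As a concrete reading of this idea, the minimal sketch below adds 1 - r (one minus the batch-level Pearson correlation between predictions and gold labels) to a standard regression loss; the MSE base loss and the weight lam are assumptions, since the abstract only specifies the Pearson term:

    import torch
    import torch.nn.functional as F

    def pearson_regularized_loss(preds, targets, lam=1.0):
        # Standard regression loss on the predicted emotion levels (targets as floats).
        mse = F.mse_loss(preds, targets)
        # Batch-level Pearson correlation between predictions and gold labels.
        p = preds - preds.mean()
        t = targets - targets.mean()
        r = (p * t).sum() / (p.norm() * t.norm() + 1e-8)
        # Minimizing (1 - r) pushes the correlation toward 1, discouraging
        # extreme predictions that break the ordering of emotion levels.
        return mse + lam * (1.0 - r)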
This paper presents our participation in the WASSA 2024 Shared Task on Empathy Detection and Emotion Classification and Personality Detection in Interactions. We focus on Track 2: Empathy and Emotion Prediction in Conversation Turns (CONV-turn), which consists of predicting the perceived empathy, emotion polarity, and emotion intensity at the turn level in a conversation. In our method, we conduct BERT- and DeBERTa-based fine-tuning, implement a CombinedLoss consisting of a structured contrastive loss and a Pearson loss, and adopt adversarial training using the Fast Gradient Method (FGM). This method achieved Pearson correlations of 0.581 for Emotion, 0.644 for Emotional Polarity, and 0.544 for Empathy on the test set, with an average of 0.590, ranking 4th among all teams. After submission to the WASSA 2024 competition, we further introduced segmented mixup for data augmentation, boosting for ensembling, and regression experiments, which yield even better results: Pearson correlations of 0.6521 for Emotion, 0.7376 for Emotional Polarity, and 0.6326 for Empathy on the development set. The implementation and fine-tuned models are publicly available at https://github.com/hyy-33/hyy33-WASSA-2024-Track-2.
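FGM adversarial training follows a standard recipe; the sketch below is a common PyTorch formulation that perturbs the word-embedding weights along their gradient before a second backward pass. The epsilon value and the "word_embeddings" parameter-name filter are assumptions, not details taken from the paper:

    import torch

    class FGM:
        def __init__(self, model, epsilon=1.0, emb_name="word_embeddings"):
            self.model, self.epsilon, self.emb_name = model, epsilon, emb_name
            self.backup = {}

        def attack(self):
            # Perturb embedding weights in the gradient direction (L2-normalized).
            for name, param in self.model.named_parameters():
                if param.requires_grad and self.emb_name in name and param.grad is not None:
                    self.backup[name] = param.data.clone()
                    norm = param.grad.norm()
                    if norm != 0 and not torch.isnan(norm):
                        param.data.add_(self.epsilon * param.grad / norm)

        def restore(self):
            # Undo the perturbation before the optimizer step.
            for name, param in self.model.named_parameters():
                if name in self.backup:
                    param.data = self.backup[name]
            self.backup = {}

    # Typical training step: loss.backward() for clean gradients, fgm.attack(),
    # a second forward/backward on the perturbed embeddings to accumulate the
    # adversarial gradient, then fgm.restore() and optimizer.step().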