Mengjiao Bao
2026
TeleAI at SemEval-2026 Task 3: Large Language Models for Dimensional Aspect-Based Sentiment Analysis
Yan Zhou | Wangshicheng Wang | Shiquan Wang | Mengjiao Bao | Ruiyu Fang | Shuangyong Song | Yongxiang Li | Xuelong Li
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper describes TeleAI’s system for SemEval-2026 Task 3, Track A, Subtask 1 (DimASR), which focuses on predicting continuous Valence-Arousal (VA) scores for specific aspects in text. We frame this task as an end-to-end regression problem and propose a robust framework utilizing Qwen2.5-7B as the feature extraction backbone, combined with parameter-efficient fine-tuning via LoRA. To enhance model generalization and mitigate domain shifts, we primarily leverage multilingual and multi-domain mixed training. Furthermore, our system integrates several optimization and robustness techniques to stabilize continuous score prediction, including R-Drop-style consistency regularization, embedding-level PGD adversarial training, Smooth L1 (Huber) loss, sigmoid-based output interval mapping, and post-hoc linear calibration. Our comprehensive ablations demonstrate that the combination of joint training and robustness regularizations substantially reduces the official evaluation metric, $RMSE_{VA}$. The proposed system achieves highly competitive performance across multiple language and domain settings, demonstrating the efficacy of applying lightweight LLM adaptation for dimensional aspect-based sentiment analysis.
TrendFact: A Benchmark Towards Hotspot Perception in Automatic Fact-Checking
Xiaocheng Zhang | Xi Wang | Yifei Lu | Jianing Wang | Zhuangzhuang Ye | Mengjiao Bao | Peng Yan | Xiaohong Su
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
With the surge of online misinformation, Large Language Models (LLMs) and Reasoning Large Language Models (RLMs) serving as Automatic Fact-Checking (AFC) systems have emerged as a prominent paradigm for reliable, explainable verification. However, our empirical study reveals that this paradigm faces a critical risk asymmetry challenge when deployed in real-world, resource-constrained environments. While Hotspot Perception Ability (HPA), the capacity to dynamically allocate reasoning resources based on social impact, is essential to mitigate this risk, existing benchmarks lack the social metadata and evaluation framework to meet this urgent evaluation need, thereby hindering the advancement of these AFC systems. To bridge this gap, we introduce TrendFact, the first benchmark capable of evaluating HPA alongside three fact-checking tasks. It consists of 7,643 curated samples sourced from trending platforms and professional datasets, with an evidence library containing 366,634 entries. To enable HPA assessment, we propose two novel metrics: the Explanation Consistency Score (ECS) to evaluate the reliability of verification reasoning, and the Hotspot Claim Perception Index (HCPI) to quantify the overall HPA of AFC systems. Extensive experiments demonstrate that existing AFC systems exhibit limited performance on TrendFact. Furthermore, our proposed FactISR framework effectively enhances HPA and computational efficiency for RLM-driven systems.