Qiang Zhang

2025

RIVAL: Reinforcement Learning with Iterative and Adversarial Optimization for Machine Translation
Tianjiao Li | Mengran Yu | Chenyu Shi | Yanjun Zhao | Xiaojing Liu | Qi Zhang | Xuanjing Huang | Qiang Zhang | Jiayin Wang
Findings of the Association for Computational Linguistics: EMNLP 2025

Large language models (LLMs) possess strong multilingual capabilities, and combining Reinforcement Learning from Human Feedback (RLHF) with translation tasks has shown great potential. However, we observe that this paradigm performs unexpectedly poorly when applied to colloquial subtitle translation tasks. In this work, we investigate this issue and find that the offline reward model (RM) gradually diverges from the online LLM due to distributional shift, ultimately leading to undesirable training outcomes. To address this, we propose RIVAL, an adversarial training framework that formulates the process as a min–max game between the RM and the LLM. RIVAL iteratively updates both models: the RM is trained to distinguish strong from weak translations (a qualitative preference reward), while the LLM is trained to improve its translations and close this gap. To stabilize training and improve generalizability, we additionally incorporate a quantitative preference reward (e.g., BLEU) into the RM, enabling reference-free quality modeling aligned with human evaluation. Through extensive experiments, we demonstrate that the proposed training framework significantly improves upon translation baselines.
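The abstract describes an alternating min–max loop between a reward model and an LLM. The following is a minimal, self-contained sketch of that loop's structure only; all components (the toy reward model, toy LLM, and unigram-overlap BLEU stand-in) are illustrative assumptions, not the authors' implementation.

```python
# Sketch of an iterative RM-vs-LLM min-max loop, assuming toy stand-ins for
# every component (not the RIVAL codebase).
import random


def toy_bleu(hypothesis: str, reference: str) -> float:
    """Toy quantitative preference reward: unigram overlap as a BLEU stand-in."""
    hyp, ref = hypothesis.split(), reference.split()
    return sum(tok in ref for tok in hyp) / len(hyp) if hyp else 0.0


class ToyRewardModel:
    """Scores translations; trained to separate strong from weak outputs."""
    def __init__(self):
        self.bias = 0.0

    def score(self, translation: str, source: str) -> float:
        return self.bias + 0.1 * len(set(translation.split()) & set(source.split()))

    def update(self, strong: str, weak: str, source: str, quant_reward: float) -> None:
        # Qualitative preference: push the strong translation above the weak one;
        # the quantitative reward (e.g. BLEU) is mixed in to stabilize training.
        margin = self.score(strong, source) - self.score(weak, source)
        self.bias += 0.01 * (quant_reward - margin)


class ToyLLM:
    """Produces candidate translations; trained to close the gap flagged by the RM."""
    def translate(self, source: str, n: int = 4) -> list:
        words = source.split()
        return [" ".join(random.sample(words, len(words))) for _ in range(n)]

    def update(self, source: str, best: str) -> None:
        pass  # placeholder for a policy-optimization (RLHF) step


def rival_round(llm, rm, source, reference):
    candidates = llm.translate(source)
    ranked = sorted(candidates, key=lambda c: rm.score(c, source), reverse=True)
    strong, weak = ranked[0], ranked[-1]
    rm.update(strong, weak, source, quant_reward=toy_bleu(strong, reference))  # RM step
    llm.update(source, strong)                                                 # LLM step


llm, rm = ToyLLM(), ToyRewardModel()
for _ in range(3):  # iterative alternation between the two players
    rival_round(llm, rm, "the cat sat on the mat", "the cat sat on the mat")
```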

Diving into Mitigating Hallucinations from a Vision Perspective for Large Vision-Language Models
Weihang Wang | Xinhao Li | Ziyue Wang | Yan Pang | Jielei Zhang | Peiyi Li | Qiang Zhang | Longwen Gao
Findings of the Association for Computational Linguistics: EMNLP 2025

Object hallucinations in Large Vision-Language Models (LVLMs) significantly impede their real-world applicability. As the primary component for accurately interpreting visual information, the choice of visual encoder is pivotal. We hypothesize that the diverse training paradigms employed by different visual encoders instill them with distinct inductive biases, which lead to different hallucination behaviors. Existing benchmarks typically focus on coarse-grained hallucination detection and fail to capture the diverse hallucination types implied by our hypothesis. To systematically analyze these effects, we introduce VHBench-10, a comprehensive benchmark for evaluating LVLMs across ten fine-grained hallucination categories. Our evaluations confirm that encoders exhibit distinct hallucination characteristics. Building on these insights, and on the observation that simple feature fusion is suboptimal, we propose VisionWeaver, a novel Context-Aware Routing Network. It employs global visual features to generate routing signals, dynamically aggregating visual features from multiple specialized experts. Comprehensive experiments confirm the effectiveness of VisionWeaver in significantly reducing hallucinations and improving overall model performance. Our code and benchmark are available at https://github.com/whwangovo/VisionWeaver.
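The abstract outlines a routing layer in which a global visual feature produces weights that mix features from several specialized encoders ("experts"). Below is a minimal sketch of that idea; the module name, shapes, and gating design are assumptions for illustration, not the released VisionWeaver code.

```python
# Sketch of a context-aware routing layer, assuming a simple softmax gate over
# expert features (illustrative only; not the VisionWeaver implementation).
import torch
import torch.nn as nn


class ContextAwareRouter(nn.Module):
    def __init__(self, dim: int, num_experts: int):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # routing signal from the global feature

    def forward(self, global_feat: torch.Tensor, expert_feats: torch.Tensor) -> torch.Tensor:
        # global_feat: (batch, dim); expert_feats: (batch, num_experts, tokens, dim)
        weights = torch.softmax(self.gate(global_feat), dim=-1)  # (batch, num_experts)
        weights = weights[:, :, None, None]                      # broadcast over tokens and dim
        return (weights * expert_feats).sum(dim=1)               # fused: (batch, tokens, dim)


router = ContextAwareRouter(dim=64, num_experts=3)
fused = router(torch.randn(2, 64), torch.randn(2, 3, 16, 64))
print(fused.shape)  # torch.Size([2, 16, 64])
```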