Yu He
Papers on this page may belong to the following people: Yu He, Yu He
2026
HW-TSC’s Submissions to the IWSLT 2026 Offline Speech Translation Task
Boqi Huang | Daimeng Wei | Jiaxin Guo | Yuanchang Luo | Hengchao Shang | Zongyao Li | Zhiqiang Rao | Jinlong Yang | Zhanglin Wu | Yu He | Xiaoqing Lan
Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)
This paper describes HW-TSC’s submission to the IWSLT 2026 Offline Speech Translation Task, specifically for the English-to-Chinese and English-to-German unconstrained tracks. Our system adopts a robust cascade architecture optimized for long-form, unsegmented audio. To mitigate the hallucination and inconsistency issues common in long-sequence processing, we propose a two-pass transcription strategy: an initial streaming ASR with a 12-second context buffer for sentence-level coherence, followed by Qwen3-ForcedAligner for precise timestamping. Based on these alignments, a second-pass refinement is conducted using Qwen3-Omni on re-segmented 30-second chunks to ensure high-fidelity transcriptions. For the translation module, we employ a context-aware segment merging strategy (up to 150 tokens) to give the Qwen3 LLM sufficient semantic context. Experimental results on the tst-2022 benchmark demonstrate the effectiveness of our pipeline, achieving COMET scores of 0.8462 (En-Zh) and 0.7854 (En-De), significantly outperforming the standard cascade baselines.
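The segment merging step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and whitespace splitting stands in for whatever tokenizer the system actually uses. Adjacent ASR segments are merged greedily until the next one would exceed the token budget.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Assumption: whitespace tokens approximate the model tokenizer.
    return len(text.split())


def merge_segments(segments: List[str], max_tokens: int = 150) -> List[str]:
    """Greedily merge adjacent segments while staying within max_tokens."""
    merged: List[str] = []
    current: List[str] = []
    current_len = 0
    for seg in segments:
        n = count_tokens(seg)
        # Flush the running group if adding this segment would exceed the budget.
        if current and current_len + n > max_tokens:
            merged.append(" ".join(current))
            current, current_len = [], 0
        current.append(seg)
        current_len += n
    if current:
        merged.append(" ".join(current))
    return merged
```

A segment that is longer than the budget on its own is kept as a single group rather than split.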
HW-TSC’s Submission to the IWSLT 2026 Cross-Lingual Voice Cloning Track
Yu He | Daimeng Wei | Jiaxin Guo | Yuanchang Luo | Hengchao Shang | Zongyao Li | Zhiqiang Rao | Jinlong Yang | Zhanglin Wu | Boqi Huang | Xiaoqing Lan
Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)
This paper presents HW-TSC’s submission to the IWSLT 2026 Cross-Lingual Voice Cloning Track. The track includes three target languages: Arabic, Chinese, and French. We take part in two of them, namely Chinese and French. We employ the Qwen3-TTS-12Hz-1.7B-Base multilingual model as the core voice cloning model. To tackle problems such as excessively long reference audio and scattered features, we design a sliding-window audio segmentation preprocessing method, which continuously splits long audio into standardized short segments with overlapping redundancy. This method avoids feature attenuation caused by overly long audio and maximizes the preservation of complete timbre information through step overlap. To select the outputs with the highest timbre similarity from numerous synthetic results, this study conducts voiceprint recognition based on ECAPA-TDNN (Emphasized Channel Attention, Propagation and Aggregation in Time Delay Neural Network), with cosine similarity as the core quantitative evaluation metric, and selects the result with the highest similarity as the optimal output.
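The two mechanisms named in this abstract, overlapping sliding-window segmentation and selection by cosine similarity, can be sketched as follows. This is an illustration under stated assumptions, not the submitted system: window and hop sizes are left as parameters because the abstract does not give them, and the speaker embeddings are assumed to be plain float vectors produced elsewhere (for example by an ECAPA-TDNN encoder).

```python
import math
from typing import List, Sequence, Tuple


def sliding_windows(n_samples: int, window: int, hop: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges; consecutive windows overlap by window - hop."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    spans: List[Tuple[int, int]] = []
    start = 0
    while start < n_samples:
        end = min(start + window, n_samples)
        spans.append((start, end))
        if end == n_samples:
            break
        start += hop
    return spans


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_a * norm_b)


def select_best(reference: Sequence[float], candidates: List[Sequence[float]]) -> int:
    """Index of the candidate embedding most similar to the reference."""
    scores = [cosine_similarity(reference, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```

With `window=4` and `hop=2`, a 10-sample signal is covered by four windows that each share two samples with their neighbour.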
HW-TSC’s Submission to the IWSLT 2026 Subtitling Track
Xiaoqing Lan | Daimeng Wei | Jiaxin Guo | Yuanchang Luo | Hengchao Shang | Zongyao Li | Zhiqiang Rao | Jinlong Yang | Zhanglin Wu | Boqi Huang | Yu He
Proceedings of the 23rd International Conference on Spoken Language Translation (IWSLT 2026)
This paper introduces HW-TSC’s submission to the IWSLT 2026 Subtitling track. For automatic subtitle generation, we employ a cascaded strategy under unconstrained conditions. First, we construct a large-model-based streaming speech recognition framework, which incorporates voice activity detection (VAD), sliding-window context caching, long audio chunking, and the Qwen3 forced alignment model to achieve high-precision transcription and timestamping from English speech to text. Next, we perform text translation using a Qwen3-based translation model. Finally, according to subtitle constraints such as characters per second (CPS) and characters per line (CPL), we identify translation segments that exceed compliance thresholds via quantitative evaluation, and rewrite them using a large language model while preserving core semantic meaning, ultimately producing subtitle files that meet the required standards.
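The compliance check on CPS and CPL described in this abstract can be sketched as below. The `Subtitle` class and function names are hypothetical, the thresholds are passed in because the abstract does not state them, and counting every character including spaces is an assumption (conventions differ between subtitle guidelines).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subtitle:
    start: float      # seconds
    end: float        # seconds
    lines: List[str]  # one string per displayed line


def chars_per_second(sub: Subtitle) -> float:
    duration = sub.end - sub.start
    n_chars = sum(len(line) for line in sub.lines)
    # A non-positive duration can never be compliant.
    return n_chars / duration if duration > 0 else float("inf")


def violates(sub: Subtitle, max_cps: float, max_cpl: int) -> bool:
    """True if the subtitle exceeds either the CPS or the CPL threshold."""
    too_fast = chars_per_second(sub) > max_cps
    too_wide = any(len(line) > max_cpl for line in sub.lines)
    return too_fast or too_wide
```

Subtitles flagged by `violates` would be the candidates passed to the rewriting step.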
MAXS: Meta-Adaptive Exploration with LLM Agents
Jian Zhang | Zhiyuan Wang | Zhangqi Wang | Yu He | Haoran Luo | Li Yuan | Lingling Zhang | Rui Mao | Qika Lin | Jun Liu
Findings of the Association for Computational Linguistics: ACL 2026
Large Language Model (LLM) Agents exhibit inherent reasoning abilities through the collaboration of multiple tools. However, during agent inference, existing methods often suffer from (i) locally myopic generation, due to the absence of lookahead, and (ii) trajectory instability, where minor early errors can escalate into divergent reasoning paths. These issues make it difficult to balance global effectiveness and computational efficiency. To address these two issues, we propose meta-adaptive exploration with LLM agents (MAXS, https://github.com/exoskeletonzj/MAXS), a meta-adaptive reasoning framework based on LLM Agents that flexibly integrates tool execution and reasoning planning. MAXS employs a lookahead strategy to extend reasoning paths a few steps ahead, estimating the advantage value of tool usage, and combines step consistency variance and inter-step trend slopes to jointly select stable, consistent, and high-value reasoning steps. Additionally, we introduce a trajectory convergence mechanism that controls computational cost by halting further rollouts once path consistency is achieved, enabling a balance between resource efficiency and global effectiveness in multi-tool reasoning. We conduct extensive empirical studies across three base models (MiMo-VL-7B, Qwen2.5-VL-7B, Qwen2.5-VL-32B) and five datasets, demonstrating that MAXS consistently outperforms existing methods in both performance and inference efficiency. Further analysis confirms the effectiveness of our lookahead strategy and tool usage.
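The two statistics named in this abstract, a variance over step values and a trend slope across steps, can be computed as in the sketch below. How MAXS defines the underlying values and combines the two signals is not stated in the abstract, so this only shows the generic quantities: population variance and an ordinary least-squares slope over step indices 0, 1, 2, and so on. The function names are hypothetical.

```python
from statistics import fmean, pvariance
from typing import Sequence


def step_variance(values: Sequence[float]) -> float:
    """Population variance of the per-step values (lower means more consistent)."""
    return pvariance(values)


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their step index."""
    n = len(values)
    if n < 2:
        return 0.0  # A single step has no trend.
    x_mean = (n - 1) / 2
    y_mean = fmean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den
```

A candidate whose lookahead values rise steadily has a positive slope, while one whose values fluctuate has a higher variance.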
Natural-Language Policies to Executable Decisions: An Interpretable Large Language Model Framework
Ziqiang Zhang | Jing Ma | Zilong Wang | Jiayuan Chen | Yi Qiao | Yu He | Wei Zhang | Dai Cheng | Xiaoyu Shen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Pricing automation in large-scale tourism is challenging because travel orders are highly unstructured, while pricing policies are complex, rapidly evolving, and inherently open-ended. Traditional rule engines are brittle and costly to maintain, whereas unconstrained LLM agents lack the reliability and auditability required for financial decisions. We present a production-grade LLM-powered pricing system with a strict decision boundary: LLMs perform structured extraction and bounded policy/path selection, while all numeric pricing, including total-price computation, is executed deterministically. Policies are compiled into interpretable condition trees, enabling open-ended support for new clauses and evolving rules without code changes, while exposing auditable artifacts for human-in-the-loop control. Periodic fine-tuning on logged traces further improves tree induction and path matching. Deployed at a municipal state-owned tourism enterprise across 7 scenic sites and 12 business categories with 1,500+ operators and 1,000+ active policies, the system processed 3,960 orders in six months, reduced the order management team from 15-20 to 3, and cut per-order handling time from 10 minutes to <2 minutes.
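The decision boundary described in this abstract, where a model only selects a path through a policy tree and all arithmetic is deterministic, can be illustrated with a toy sketch. Everything here is hypothetical: the node layout, the field names (`group_size`, `quantity`), and the prices are invented for illustration and do not come from the deployed system.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    unit_price: float  # hypothetical price attached to a policy outcome


@dataclass
class Node:
    field: str         # name of an extracted order field
    threshold: float   # branch taken when order[field] >= threshold
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def resolve(tree: Union[Node, Leaf], order: Dict[str, float]) -> Leaf:
    """Walk the condition tree using already-extracted order fields."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if order[node.field] >= node.threshold else node.if_false
    return node


def total_price(tree: Union[Node, Leaf], order: Dict[str, float]) -> float:
    """Deterministic computation: no model is involved in the arithmetic."""
    leaf = resolve(tree, order)
    return round(leaf.unit_price * order["quantity"], 2)
```

Because the tree is plain data, the path taken for any order can be logged and reviewed by a human.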
2025
ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis
Zeao Tu | Xiangdi Meng | Yu He | Zihan Yao | Tianyu Qi | Jun Liu | Ming Li
Findings of the Association for Computational Linguistics: NAACL 2025
Large language models (LLMs) have shown remarkable effectiveness across various domains, with data augmentation methods utilizing GPT for synthetic data generation becoming prevalent. However, the quality and utility of augmented data remain questionable, and current methods lack clear metrics for evaluating data characteristics. To address these challenges, we propose ResoFilter, a novel method that integrates models, data, and tasks to refine datasets. ResoFilter leverages the fine-tuning process to obtain Data-Parameter features for data selection, offering improved interpretability by representing data characteristics through model weights. Our experiments demonstrate that ResoFilter achieves comparable results to full-scale fine-tuning using only half the data in mathematical tasks and exhibits strong generalization across different models and domains. This method provides valuable insights for constructing synthetic datasets and evaluating high-quality data, offering a promising solution for enhancing data augmentation techniques and improving training dataset quality for LLMs. For reproducibility, we will release our code and data upon acceptance.
Are LLMs Rational Investors? A Study on the Financial Bias in LLMs
Yuhang Zhou | Yuchen Ni | Zhiheng Xi | Zhangyue Yin | Yu He | Gan Yunhui | Xiang Liu | Zhang Jian | Sen Liu | Xipeng Qiu | Yixin Cao | Guangnan Ye | Hongfeng Chai
Findings of the Association for Computational Linguistics: ACL 2025
Large language models (LLMs) excel in natural language generation but also exhibit biases, particularly in gender, race, and religion, which can be amplified with widespread use. However, research on biases in specific domains, such as finance, remains limited. To address this gap, we conducted a comprehensive evaluation of 23 leading LLMs and found varying degrees of financial bias, including more pronounced biases in financial-specific LLMs (FinLLMs). In response, we propose the Financial Bias Indicators (FBI) framework, which includes components like the Bias Unveiler, Bias Detective, Bias Tracker, and Bias Antidote, designed to identify, detect, analyze, and mitigate financial biases. Our analysis explores the root causes of these biases and introduces a debiasing method based on financial causal knowledge, alongside three other debiasing techniques. For the most biased model, we successfully reduced bias by 68% according to key metrics. This study advances our understanding of LLM biases in finance and highlights the need for greater scrutiny in their application within this critical domain.
2024
R3-NL2GQL: A Model Coordination and Knowledge Graph Alignment Approach for NL2GQL
Yuhang Zhou | Yu He | Siyu Tian | Yuchen Ni | Zhangyue Yin | Xiang Liu | Chuanjun Ji | Sen Liu | Xipeng Qiu | Guangnan Ye | Hongfeng Chai
Findings of the Association for Computational Linguistics: EMNLP 2024
While current tasks of converting natural language to SQL (NL2SQL) using Foundation Models have shown impressive achievements, adapting these approaches for converting natural language to Graph Query Language (NL2GQL) encounters hurdles due to the distinct nature of GQL compared to SQL, alongside the diverse forms of GQL. Moving away from traditional rule-based and slot-filling methodologies, we introduce a novel approach, R3-NL2GQL, integrating both small and large Foundation Models for ranking, rewriting, and refining tasks. This method leverages the interpretative strengths of smaller models for initial ranking and rewriting stages, while capitalizing on the superior generalization and query generation prowess of larger models for the final transformation of natural language queries into GQL formats. Addressing the scarcity of datasets in this emerging field, we have developed a bilingual dataset, sourced from graph database manuals and selected open-source Knowledge Graphs (KGs). Our evaluation of this methodology on this dataset demonstrates its promising efficacy and robustness.
2022
Beyond Emotion: A Multi-Modal Dataset for Human Desire Understanding
Ao Jia | Yu He | Yazhou Zhang | Sagar Uprety | Dawei Song | Christina Lioma
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Desire is a strong wish to do or have something, which involves not only a linguistic expression, but also underlying cognitive phenomena driving human feelings. As the most primitive and basic human instinct, conscious desire is often accompanied by a range of emotional responses. Desire understanding is a strikingly understudied task, and it is difficult for machines to model and understand desire due to the unavailability of benchmarking datasets with desire and emotion labels. To bridge this gap, we present MSED, the first multi-modal and multi-task sentiment, emotion and desire dataset, which contains 9,190 text-image pairs, with English text. Each multi-modal sample is annotated with six desires, three sentiments and six emotions. We also propose state-of-the-art baselines to evaluate the potential of MSED and show the importance of multi-task and multi-modal clues for desire understanding. We hope this study provides a benchmark for human desire analysis. MSED will be publicly available for research.
2015
Polarity Classification of Short Product Reviews via Multiple Cluster-based SVM Classifiers
Jiaying Song | Yu He | Guohong Fu
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation: Posters
2014
Improving Chinese Sentence Polarity Classification via Opinion Paraphrasing
Guohong Fu | Yu He | Jiaying Song | Chaoyue Wang
Proceedings of the Third CIPS-SIGHAN Joint Conference on Chinese Language Processing
2013
Co-authors
- Guohong Fu 3
- Jiaxin Guo 3
- Boqi Huang 3
- Xiaoqing Lan 3
- Zongyao Li 3
- Yuanchang Luo 3
- Zhiqiang Rao 3
- Hengchao Shang 3
- Daimeng Wei 3
- Zhanglin Wu 3
- Jinlong Yang 3
- Hongfeng Chai (柴洪峰) 2
- Jun Liu 2
- Sen Liu 2
- Xiang Liu 2
- Yuchen Ni 2
- Xipeng Qiu (邱锡鹏) 2
- Jiaying Song 2
- Guangnan Ye (叶广楠) 2
- Zhangyue Yin 2
- Yuhang Zhou (周宇航) 2
- Yixin Cao 1
- Jiayuan Chen 1
- Dai Cheng 1
- Chuanjun Ji 1
- Ao Jia 1
- Zhang Jian 1
- Ming Li 1
- Qika Lin 1
- Christina Lioma 1
- Haoran Luo 1
- Jing Ma 1
- Rui Mao 1
- Xiangdi Meng 1
- Tianyu Qi 1
- Yi Qiao 1
- Xiaoyu Shen 1
- Dawei Song 1
- Siyu Tian 1
- Zeao Tu 1
- Sagar Uprety 1
- Chaoyue Wang 1
- Zhangqi Wang 1
- Zhiyuan Wang 1
- Zilong Wang 1
- Zhiheng Xi 1
- Zihan Yao 1
- Li Yuan 1
- Gan Yunhui 1
- Jian Zhang 1
- Lingling Zhang 1
- Wei Zhang 1
- Yazhou Zhang 1
- Ziqiang Zhang 1