Yi Zhu
2026
Chinese Live-Streaming E-Commerce Morph Resolution: Datasets and Methods
Jipeng Qiang | Jiahao Zhu | Yi Zhu | Chaowei Zhang
Findings of the Association for Computational Linguistics: ACL 2026
Live-stream E-commerce faces significant challenges from morphs: deliberate linguistic variants used to evade real-time voice filters and to amplify product claims illegally. Although it is critical for regulatory enforcement, research on Live Auditory Morph Resolution (LiveAMR) is hindered by limited datasets: prior work relied on narrow, redundant health-domain corpora, which restricts model robustness. To bridge this gap, we introduce two datasets: (1) HealthAMR, a health-domain corpus refined through deduplication and re-annotation; and (2) GeneralAMR, a general-domain benchmark with 28K annotated sentences from 77 channels across 7 E-commerce categories. Further, we propose JointMRE, a multi-task framework that jointly resolves morphs and generates structured explanations, transferring grammatical insights from large language models to enhance generalization. Predictions are refined by our Conflict-aware Dual-output Refinement Framework (CDRF), which detects inconsistencies between corrections and explanations. Experiments show that CDRF significantly improves morph resolution accuracy and interpretability. Our datasets and code are available at https://anonymous.4open.science/r/Morph-Resolution-Datasets-and-Methods-611E.
2025
AI4Reading: Chinese Audiobook Interpretation System Based on Multi-Agent Collaboration
Minjiang Huang | Jipeng Qiang | Yi Zhu | Chaowei Zhang | Xiangyu Zhao | Kui Yu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Audiobook interpretations are attracting increasing attention, as they provide accessible and in-depth analyses of books that offer readers practical insights and intellectual inspiration. However, creating them manually remains time-consuming and resource-intensive. To address this challenge, we propose AI4Reading, a multi-agent collaboration system that leverages large language models (LLMs) and speech synthesis technology to generate podcast-like audiobook interpretations. The system is designed to meet three key objectives: accurate content preservation, enhanced comprehensibility, and a logical narrative structure. To achieve these goals, we develop a framework composed of 11 specialized agents (including topic analysts, case analysts, editors, a narrator, and proofreaders) that work in concert to explore themes, extract real-world cases, refine content organization, and synthesize natural spoken language. A comparison of expert interpretations with our system’s output shows that, although AI4Reading still has a gap in speech generation quality, the generated interpretative scripts are simpler and more accurate. The code of AI4Reading is publicly accessible, with a demonstration video available.
2023
ParaLS: Lexical Substitution via Pretrained Paraphraser
Jipeng Qiang | Kang Liu | Yun Li | Yunhao Yuan | Yi Zhu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Lexical substitution (LS) aims at finding appropriate substitutes for a target word in a sentence. Recently, LS methods based on pretrained language models have made remarkable progress, generating potential substitutes for a target word through analysis of its contextual surroundings. However, these methods tend to overlook the preservation of the sentence’s meaning when generating the substitutes. This study explores how to generate substitute candidates from a paraphraser, since the paraphrases it generates contain variations in word choice while preserving the sentence’s meaning. Because commonly used decoding strategies cannot generate the substitutes directly, we propose two simple decoding strategies that focus on the variations of the target word during decoding. Experimental results show that our methods outperform state-of-the-art LS methods based on pretrained language models on three benchmarks.