Wenhao Li
2026
MEIC-DT: Memory-Efficient Incremental Clustering for Long-Text Coreference Resolution with Dual-Threshold Constraints
Kangyang Luo | Shuzheng Si | Yuzhuo Bai | Cheng Gao | Zhitong Wang | Cheng Huang | Yingli Shen | Yufeng Han | Wenhao Li | Cunliang Kong | Maosong Sun
Findings of the Association for Computational Linguistics: ACL 2026
In the era of large language models (LLMs), supervised neural methods remain the state-of-the-art (SOTA) for Coreference Resolution. Yet their full potential is underexplored, particularly in incremental clustering, which faces the critical challenge of balancing efficiency with performance on long texts. To address this limitation, we propose MEIC-DT, a novel dual-threshold, memory-efficient incremental clustering approach based on a lightweight Transformer. MEIC-DT features a dual-threshold constraint mechanism designed to precisely control the Transformer’s input scale within a predefined memory budget. This mechanism incorporates two key components: a Statistics-Aware Eviction Strategy (SAES) and an Internal Regularization Policy (IRP). The SAES utilizes distinct statistical profiles from the training and inference phases for intelligent cache management. The IRP strategically condenses clusters by selecting the most representative mentions, thereby preserving semantic integrity. Extensive experiments on common benchmarks demonstrate that MEIC-DT achieves highly competitive coreference performance under stringent memory constraints.
From Scaffolding to Assimilation: Progressive Structural Internalization for Format-Constrained Creative Text Generation
Wenhao Li | Yuwei Yang | Xiaoqing Wu | Yufeng Han | Cunliang Kong | Yuzhuo Bai | Xin Cong | Maosong Sun
Findings of the Association for Computational Linguistics: ACL 2026
While Large Language Models (LLMs) demonstrate remarkable capabilities in open-ended creative generation, they notably struggle with Format-Constrained Generation tasks, such as poetry and lyrics, where strict adherence to multidimensional structural constraints (i.e., format, phonetics, and rhyme) is a prerequisite to aesthetic value. Existing paradigms predominantly rely on unreliable prompting or rigid constrained decoding strategies; the former often fails to ensure compliance, while the latter compromises inference latency and disrupts the natural probability distribution, degrading generation quality. To bridge this gap, we establish CCP-Arena, a rigorous testbed for Chinese Classical Poetry, and propose Progressive Structural Internalization (PSI), a novel framework designed to embed external constraints into the model’s intrinsic intuition. PSI begins with Structural Scaffolding via Explicit Cognitive Planning, utilizing an explicit template to provide a structural scaffold for subsequent generation. This is followed by a Cascaded Reinforcement Learning stage guided by a Holistic Reward Model, which optimizes for precise structural-semantic alignment. Extensive experiments demonstrate that PSI achieves state-of-the-art performance, surpassing baselines in both strict constraint adherence and literary aesthetics. Furthermore, mechanistic analysis confirms that our method effectively internalizes structural information into the model’s latent representations, offering a robust and efficient solution for constrained creative generation.
ImCoref-CeS: An Improved Lightweight Pipeline for Coreference Resolution with LLM-based Checker-Splitter Refinement
Kangyang Luo | Yuzhuo Bai | Shuzheng Si | Cheng Gao | Zhitong Wang | Yingli Shen | Wenhao Li | Zhu Liu | Yufeng Han | Jiayi Wu | Cunliang Kong | Maosong Sun
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Coreference Resolution (CR) is a critical task in Natural Language Processing (NLP). Current research faces a key dilemma: whether to further explore the potential of supervised neural methods based on small language models, whose detect-then-cluster pipeline still delivers top performance, or embrace the powerful capabilities of Large Language Models (LLMs). However, effectively combining their strengths remains underexplored. To this end, we propose ImCoref-CeS, a novel framework that integrates an enhanced supervised model with LLM-based reasoning. First, we present an improved CR method (ImCoref) to push the performance boundaries of the supervised neural method by introducing a lightweight bridging module to enhance long-text encoding capability, devising a biaffine scorer to comprehensively capture positional information, and invoking a hybrid mention regularization to improve training efficiency. Importantly, we employ an LLM acting as a multi-role Checker-Splitter agent to validate candidate mentions (filtering out invalid ones) and coreference results (splitting erroneous clusters) predicted by ImCoref. Extensive experiments demonstrate the effectiveness of ImCoref-CeS, which achieves superior performance compared to existing state-of-the-art (SOTA) methods.