Zehan Li
2026
Zero-shot Jianzi Recognition as Structured Visual Information Extraction in Open Compositional Symbolic Systems
Zehan Li | Fu Zhang | Zhijun Liu | Jingwei Cheng
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Guqin (古琴) Jianzi (減字) is an open and freely compositional tablature system that encodes performance actions rather than acoustic outcomes. Its automatic recognition remains largely unexplored, as conventional OCR assumes a closed and enumerable glyph set and struggles with Jianzi’s unbounded composition and manuscript-level variability. We introduce Zero-shot Jianzi Recognition, which formulates Jianzi recognition as vision-to-sequence prediction of canonical component sequences under a zero-shot split. To enable scalable supervision, we construct Synthetic-JZ from aligned online composition metadata. We then synthesize manuscript-like training images via component-wise style recomposition and manuscript-domain noise modeling, and fine-tune a vision–language model for end-to-end component sequence recognition. At inference time, a lightweight legality-guided correction module re-ranks decoding candidates, suppressing structural hallucinations without modifying the backbone. Experiments on two benchmarks show that our method achieves 63.02% sequence accuracy on Real-JZ, our manually annotated real-world Jianzi benchmark, surpassing Gemini-3-Pro by 35.11%. This result highlights the feasibility of reliable automated Jianzi recognition and its potential for large-scale digitization of historical Guqin Jianzi Pu manuscripts.
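The inference-time step described in the abstract above, re-ranking decoding candidates by structural legality, can be illustrated with a minimal sketch. It is not the paper's correction module: the candidate format, the fixed `penalty`, and the toy legality rule (`starts_with_layout`) are all assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

# A candidate is (component sequence, decoder log-probability).
Candidate = Tuple[List[str], float]


def rerank(
    candidates: List[Candidate],
    is_legal: Callable[[List[str]], bool],
    penalty: float = 5.0,
) -> List[Candidate]:
    """Re-rank candidates, subtracting a fixed penalty from illegal ones.

    The backbone's scores are left untouched for legal sequences; illegal
    sequences are only demoted, so they can still win if nothing legal exists.
    """
    adjusted = [
        (seq, score if is_legal(seq) else score - penalty)
        for seq, score in candidates
    ]
    # Highest adjusted score first.
    return sorted(adjusted, key=lambda c: c[1], reverse=True)


def starts_with_layout(seq: List[str]) -> bool:
    """Hypothetical legality rule: a sequence must begin with a layout token."""
    return len(seq) > 0 and seq[0].startswith("LAYOUT_")


if __name__ == "__main__":
    beams = [
        (["comp_a", "comp_a"], -0.1),          # best raw score, but illegal
        (["LAYOUT_TB", "comp_a", "comp_b"], -0.5),
    ]
    print(rerank(beams, starts_with_layout)[0][0])
```

Because the check runs only on decoded candidates, a rule set like this can be swapped or extended without retraining the model that produced them.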
2025
Generation-Augmented Retrieval: Rethinking the Role of Large Language Models in Zero-Shot Relation Extraction
Zehan Li | Fu Zhang | Tianyue Peng | He Liu | Jingwei Cheng
Findings of the Association for Computational Linguistics: EMNLP 2025
Recent advances in Relation Extraction (RE) emphasize Zero-Shot methodologies, aiming to recognize unseen relations between entities with no annotated data. Although Large Language Models (LLMs) have demonstrated outstanding performance in many NLP tasks, their performance in Zero-Shot RE (ZSRE) without entity type constraints still lags behind Small Language Models (SLMs). LLM-based ZSRE often involves manual interventions and significant computational overhead, especially when scaling to large-scale multi-choice data. To this end, we introduce RE-GAR-AD, which not only leverages the generative capability of LLMs but also utilizes their representational power without tuning LLMs. We redefine LLM-based ZSRE as a retrieval challenge, utilizing a Generation-Augmented Retrieval framework coupled with a retrieval Adjuster. Specifically, our approach guides LLMs through crafted prompts to distill sentence semantics and enrich relation labels. We encode sentences and relation labels using LLMs and match their embeddings in a triplet fashion. This retrieval technique significantly reduces token input requirements. Additionally, to further optimize embeddings, we propose a plug-in retrieval adjuster with only 2M parameters, which allows rapid fine-tuning without accessing LLMs’ parameters. Our LLM-based model demonstrates comparable performance on multiple benchmarks.
Frame First, Then Extract: A Frame-Semantic Reasoning Pipeline for Zero-Shot Relation Triplet Extraction
Zehan Li | Fu Zhang | Wenqing Zhang | Jiawei Li | Zhou Li | Jingwei Cheng | Tianyue Peng
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) have shown impressive capabilities in language understanding and generation, leading to growing interest in zero-shot relation triplet extraction (ZeroRTE), a task that aims to extract triplets for unseen relations without annotated data. However, existing methods typically depend on costly fine-tuning and lack the structured semantic guidance required for accurate and interpretable extraction. To overcome these limitations, we propose FrameRTE, a novel ZeroRTE framework that adopts a “frame first, then extract” paradigm. Rather than extracting triplets directly, FrameRTE first constructs high-quality Relation Semantic Frames (RSFs) through a unified pipeline that integrates frame retrieval, synthesis, and enhancement. These RSFs serve as structured and interpretable knowledge scaffolds that guide frozen LLMs in the extraction process. Building upon these RSFs, we further introduce a human-inspired three-stage reasoning pipeline consisting of semantic frame evocation, frame-guided triplet extraction, and core frame elements validation to achieve semantically constrained extraction. Experiments demonstrate that FrameRTE achieves competitive zero-shot performance on multiple benchmarks. Moreover, the RSFs we construct serve as high-quality semantic resources that can enhance other extraction methods, showcasing the synergy between linguistic knowledge and foundation models.
2024
AlignRE: An Encoding and Semantic Alignment Approach for Zero-Shot Relation Extraction
Zehan Li | Fu Zhang | Jingwei Cheng
Findings of the Association for Computational Linguistics: ACL 2024
Zero-shot Relation Extraction (ZSRE) aims to predict unseen relations between entity pairs from input sentences. Existing prototype-based ZSRE methods encode relation descriptions into prototype embeddings and predict by measuring the similarity between sentence embeddings and prototype embeddings. However, these methods often overlook abundant side information of relations and suffer from a significant encoding gap between prototypes and sentences, limiting performance. To this end, we propose a framework named AlignRE, based on two Alignment methods for ZSRE. Specifically, we present a novel perspective centered on encoding schema alignment to enhance prototype-based ZSRE methods. We utilize well-designed prompt-tuning to bridge the encoding gap. To improve prototype quality, we explore and leverage multiple side information and propose a prototype aggregation method based on semantic alignment to create comprehensive relation prototype representations. We conduct experiments on FewRel and Wiki-ZSL datasets and consistently outperform state-of-the-art methods. Moreover, our method exhibits substantially faster performance and reduces the need for extensive manual labor in prototype construction. Code is available at https://github.com/lizehan1999/AlignRE.
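The prototype-based prediction scheme summarized in the abstract above (embed each relation as a prototype, then pick the prototype most similar to the sentence embedding) can be sketched as follows. This is a generic illustration, not AlignRE's implementation: the toy 2-dimensional vectors stand in for real encoder outputs, and the plain mean used in `aggregate` is an assumed placeholder for the paper's semantic-alignment-based aggregation.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def aggregate(side_info_vectors: List[Vector]) -> List[float]:
    """Combine several side-information embeddings into one prototype.

    Assumption: an unweighted mean, used here only as a simple stand-in.
    """
    n = len(side_info_vectors)
    return [sum(column) / n for column in zip(*side_info_vectors)]


def predict(sentence_vec: Vector, prototypes: Dict[str, Vector]) -> str:
    """Return the relation whose prototype is most similar to the sentence."""
    return max(prototypes, key=lambda rel: cosine(sentence_vec, prototypes[rel]))


if __name__ == "__main__":
    # Hypothetical relation names and embeddings, for illustration only.
    prototypes = {
        "born_in": aggregate([[1.0, 0.0], [0.8, 0.2]]),
        "works_for": aggregate([[0.0, 1.0], [0.2, 0.8]]),
    }
    print(predict([0.9, 0.1], prototypes))
```

Since unseen relations only require new prototype vectors, adding a relation at test time means adding one entry to `prototypes` rather than retraining a classifier.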