2025
Entity Pair-guided Relation Summarization and Retrieval in LLMs for Document-level Relation Extraction
Fu Zhang | Hongsen Yu | Jingwei Cheng | Huangming Xu
Findings of the Association for Computational Linguistics: NAACL 2025
Document-level relation extraction (DocRE) aims to extract relations between entities in a document. While previous research has primarily focused on traditional small models, recent studies have extended the scope to large language models (LLMs). Current LLM-based methods typically filter all potential relations (candidate relations) in a document in a single pass and then extract triplet facts. However, most candidate relation filtering operates at the document level, so candidate relations end up only weakly correlated with individual entity pairs. In addition, the data imbalance caused by the large amount of no-relation data (the NA problem) is another important reason for the suboptimal performance of LLM-based methods. To address these issues, we propose an entity pair-guided relation summarization and retrieval model (EP-RSR) for DocRE, which introduces an innovative LLM-based document-level relation extraction paradigm, EPRF (Entity Pair-Relation-Fact), along with an entity pair-level candidate relation filtering method. Our approach first selects entity pairs that potentially contain relations and uses them to guide relation summarization and retrieval for extracting relation facts. This strengthens the relevance between candidate relations and entity pairs while alleviating the imbalance of NA data. Benchmark testing on three datasets demonstrates that our approach achieves state-of-the-art (SOTA) performance among LLM-based models. Our code is available at https://github.com/LookingYu/EP-RSR.
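To make the three-stage EPRF paradigm concrete, here is a minimal sketch of an entity pair → relation → fact pipeline as the abstract describes it. All names (llm, select_entity_pairs, retrieve_relations, extract_fact) and the prompt wording are illustrative assumptions, not the authors' actual API.

```python
# Sketch of an EPRF-style pipeline: filter entity pairs first, then retrieve
# pair-specific candidate relations, then extract the final triplet fact.
from typing import Callable, List, Optional, Tuple

LLM = Callable[[str], str]  # any text-in/text-out LLM client

def select_entity_pairs(doc: str, entities: List[str], llm: LLM) -> List[Tuple[str, str]]:
    """Step 1: keep only entity pairs that plausibly hold some relation,
    shrinking the flood of no-relation (NA) pairs before extraction."""
    kept = []
    for head in entities:
        for tail in entities:
            if head == tail:
                continue
            ans = llm(f"Document:\n{doc}\n\nCould any relation hold between "
                      f"'{head}' and '{tail}'? Answer yes or no.")
            if ans.strip().lower().startswith("yes"):
                kept.append((head, tail))
    return kept

def retrieve_relations(doc: str, head: str, tail: str,
                       inventory: List[str], llm: LLM, k: int = 3) -> List[str]:
    """Step 2: pair-guided summarization/retrieval of a short candidate
    relation list, instead of filtering relations once per document."""
    ans = llm(f"Document:\n{doc}\n\nFrom {inventory}, list up to {k} relation "
              f"types that could link '{head}' to '{tail}', comma-separated.")
    return [r.strip() for r in ans.split(",") if r.strip()][:k]

def extract_fact(doc: str, head: str, tail: str,
                 candidates: List[str], llm: LLM) -> Optional[str]:
    """Step 3: final triplet decision, restricted to the retrieved candidates."""
    ans = llm(f"Document:\n{doc}\n\nWhich relation in {candidates} holds between "
              f"'{head}' and '{tail}'? Answer with one relation or 'none'.").strip()
    return None if ans.lower() == "none" else ans
```

Because stages 2 and 3 only ever see pairs that survived stage 1, the NA-dominated majority of pairs never reaches the relation-extraction prompts, which is how this design addresses the imbalance the abstract highlights.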
An Adaptive Multi-Threshold Loss and a General Framework for Collaborating Losses in Document-Level Relation Extraction
Huangming Xu | Fu Zhang | Jingwei Cheng
Findings of the Association for Computational Linguistics: ACL 2025
The goal of document-level relation extraction (DocRE) is to identify relations for a given entity pair within a document. As a multi-label classification task, the most common approach introduces an adaptive threshold: for an entity pair, any relation whose predicted score exceeds the threshold is taken to hold. However, we observe two phenomena that significantly weaken model performance in DocRE: (1) as the label space (the number of relations) expands, performance gradually declines; and (2) under the long-tail distribution, the model tends to favor high-frequency relations. To address these challenges, we propose an innovative Adaptive Multi-Threshold Loss (AMTL), which, for the first time, partitions the label space into different sub-label spaces (thus reducing its overall size) and learns an adaptive threshold for each sub-label space. This allows more precise tuning of the model’s sensitivity to diverse relations, mitigating the performance degradation associated with label space expansion and the long-tail problem. Moreover, our adaptive multi-threshold method can be viewed as a general framework that seamlessly integrates different losses in different sub-label spaces, enabling the concurrent application of multiple losses. Experimental results demonstrate that AMTL significantly enhances the performance of existing DocRE models across four datasets, achieving state-of-the-art results. Experiments on the concurrent application of multiple losses within our framework show stable performance that outperforms single-loss methods. Code is available at https://github.com/xhm-code/AMTL.
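A minimal PyTorch sketch of the core idea as stated above: split the relation label space into sub-spaces and learn one adaptive threshold per sub-space, applying an ATLOP-style thresholding loss within each group. The grouping scheme and loss details here are illustrative assumptions, not the authors' exact code.

```python
# Per-group adaptive thresholding: each sub-label space gets its own learned
# threshold logit appended after the R relation logits.
import torch
import torch.nn.functional as F

def group_threshold_loss(sub_logits, sub_labels, th_logit):
    """Adaptive-threshold loss within one sub-label space: rank gold
    relations above the group threshold, and the threshold above the rest."""
    logits = torch.cat([sub_logits, th_logit], dim=1)            # (B, |g|+1)
    labels = torch.cat([sub_labels, torch.zeros_like(th_logit)], dim=1)
    th_label = torch.zeros_like(labels)
    th_label[:, -1] = 1.0

    # Positives vs. threshold (mask out negatives with a large constant).
    p_mask = labels + th_label
    loss1 = -(F.log_softmax(logits - (1 - p_mask) * 1e30, dim=1) * labels).sum(1)
    # Threshold vs. negatives (mask out positives).
    n_mask = 1 - labels
    loss2 = -(F.log_softmax(logits - (1 - n_mask) * 1e30, dim=1) * th_label).sum(1)
    return (loss1 + loss2).mean()

def amtl_loss(logits, labels, groups):
    """
    logits: (B, R + G) relation scores followed by one threshold logit per group.
    labels: (B, R) multi-hot gold relations (float).
    groups: list of G LongTensors with the relation indices of each sub-space.
    """
    R = labels.size(1)
    loss = 0.0
    for g, idx in enumerate(groups):
        loss = loss + group_threshold_loss(
            logits[:, idx], labels[:, idx], logits[:, R + g : R + g + 1])
    return loss / len(groups)
```

Since each group's loss term is computed independently, a different loss function could be substituted per group, which is one plausible reading of the "collaborating losses" framework the abstract mentions; partitioning relations by frequency (e.g., head and tail relations in separate groups) would directly target the long-tail behavior.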
Rethinking the Role of LLMs for Document-level Relation Extraction: a Refiner with Task Distribution and Probability Fusion
Fu Zhang | Xinlong Jin | Jingwei Cheng | Hongsen Yu | Huangming Xu
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Document-level relation extraction (DocRE) provides a broad context for extracting one or more relations for each entity pair. Large language models (LLMs) have made great progress on relation extraction tasks, but one of the main challenges is that LLMs struggle with multi-label relation prediction. We also reveal a second noteworthy phenomenon: small language models (SLMs) for DocRE tend to classify existing relations as “no relation” (NA), while LLMs tend to predict a relation for every entity pair. To address these challenges, we propose a novel method that uses LLMs as a refiner, employing task distribution and probability fusion. The task distribution we carefully design distinguishes hard tasks from easy ones and feeds the hard tasks to our LLM-based framework for reevaluation and refinement. Further, to effectively solve the multi-label relation prediction problem during refinement, we propose a probability fusion method that improves the fused predictions by maintaining a balance between the SLM and the LLM. Extensive experiments on widely used datasets demonstrate that our method outperforms existing LLM-based methods without fine-tuning by an average of 25.2% F1. Refining SLMs with our method consistently boosts their performance, achieving new state-of-the-art results compared to existing SLMs and LLMs. Our code: https://github.com/Drasick/Drell.
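A minimal sketch of the routing-and-fusion idea from the abstract: "hard" entity pairs, where the SLM is unsure, are re-scored by an LLM, and the two probability distributions are fused. The margin rule, the fusion weight alpha, and the llm_probs callback are illustrative assumptions, not the paper's exact method.

```python
# Route hard pairs to the LLM, then fuse SLM and LLM relation probabilities.
from typing import Callable, List
import numpy as np

def is_hard(slm_probs: np.ndarray, margin: float = 0.2) -> bool:
    """Task distribution: a pair is 'hard' when the SLM's top two relation
    probabilities are close, i.e. its decision margin is small."""
    top2 = np.sort(slm_probs)[-2:]
    return float(top2[1] - top2[0]) < margin

def fuse(slm_probs: np.ndarray, llm_probs: np.ndarray,
         alpha: float = 0.5) -> np.ndarray:
    """Probability fusion: a convex combination balances the SLM (which leans
    toward NA) against the LLM (which leans toward predicting a relation)."""
    fused = alpha * slm_probs + (1.0 - alpha) * llm_probs
    return fused / fused.sum()

def refine(slm_probs: np.ndarray, pair,
           llm_probs: Callable[[object], np.ndarray],
           threshold: float = 0.5) -> List[int]:
    """Easy pairs keep the SLM output; hard pairs get LLM re-scoring plus
    fusion. The thresholded decision returns all relations above threshold,
    which is what makes the output multi-label."""
    probs = fuse(slm_probs, llm_probs(pair)) if is_hard(slm_probs) else slm_probs
    return [r for r, p in enumerate(probs) if p >= threshold]
```

The convex combination is one simple way to realize the balance the abstract describes: it pulls the SLM's NA-heavy scores and the LLM's relation-heavy scores toward each other, so neither failure mode dominates the final multi-label decision.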