Hang Yu



2025

HDiff: Confidence-Guided Denoising Diffusion for Robust Hyper-relational Link Prediction
Xiangfeng Luo | Ruoxin Zheng | Jianqiang Huang | Hang Yu
Findings of the Association for Computational Linguistics: EMNLP 2025

Although Hyper-relational Knowledge Graphs (HKGs) can model complex facts better than traditional KGs, Hyper-relational Knowledge Graph Completion (HKGC) is more sensitive to inherent noise, particularly struggling with two prevalent HKG-specific noise types: Intra-fact Inconsistency and Cross-fact Association Noise. To address these challenges, we propose HDiff, a novel conditional denoising diffusion framework for robust HKGC that learns to reverse structured noise corruption. HDiff integrates a Consistency-Enhanced Global Encoder (CGE), which uses contrastive learning to enforce intra-fact consistency, and a Context-Guided Denoiser (CGD), which performs iterative refinement. The CGD features dual conditioning that leverages CGE’s global context and local confidence estimates, effectively combating both noise types. Extensive experiments demonstrate that HDiff substantially outperforms state-of-the-art HKGC methods, highlighting its effectiveness and robustness, particularly under noisy conditions.
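
The abstract describes a denoiser conditioned on both a global fact context and a local confidence estimate. The following minimal Python sketch illustrates that conditioning idea under simplified, hypothetical assumptions (toy dimensions, a crude DDPM-style reverse loop, and made-up module names); it is not the authors' implementation.

```python
import torch
import torch.nn as nn

class ContextGuidedDenoiser(nn.Module):
    """Toy denoiser conditioned on a global fact context and a confidence score."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim * 2 + 1, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, noisy_emb, global_ctx, confidence):
        # Concatenate the noisy target embedding, global context, and confidence estimate.
        cond = torch.cat([noisy_emb, global_ctx, confidence], dim=-1)
        return self.net(cond)  # predicted noise

def denoise(denoiser, x_t, global_ctx, confidence, alphas):
    """Iteratively remove noise (heavily simplified reverse diffusion)."""
    for a in reversed(alphas):
        eps = denoiser(x_t, global_ctx, confidence)
        x_t = (x_t - (1 - a).sqrt() * eps) / a.sqrt()  # crude reverse update
    return x_t

if __name__ == "__main__":
    dim, steps = 64, 4
    denoiser = ContextGuidedDenoiser(dim)
    x_t = torch.randn(1, dim)                  # corrupted entity embedding
    ctx = torch.randn(1, dim)                  # CGE-style global fact context
    conf = torch.rand(1, 1)                    # local confidence estimate
    alphas = torch.linspace(0.9, 0.99, steps)  # toy noise schedule
    print(denoise(denoiser, x_t, ctx, conf, alphas).shape)
```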

CLEAR: A Framework Enabling Large Language Models to Discern Confusing Legal Paragraphs
Qi Xu | Qian Liu | Hao Fei | Hang Yu | Shuhao Guan | Xiao Wei
Findings of the Association for Computational Linguistics: EMNLP 2025

Most existing work focuses on enabling LLMs to leverage legal rules (e.g., law articles) to tackle complex legal reasoning tasks, but ignores their ability to understand legal rules. To better evaluate LLMs’ capabilities on this task, we propose a new challenge task: Legal Paragraph Prediction (LPP), which aims to predict the legal paragraph given criminal facts. Moreover, to enhance the legal reasoning ability of LLMs, we propose a novel framework, CLEAR, which enables LLMs to analyze legal cases with the guidance of legal rule insights. CLEAR contains four key components: the Legal Rules Retriever retrieves legal rule knowledge, the Rule Insights Generator generates legal insights that guide the LLM’s reasoning, and the Case Analyzer analyzes the case under the guidance of those insights given the criminal facts. Finally, the Legal Reasoner synthesizes the criminal facts, legal rule insights, and analysis results to derive the final decision. Extensive experiments on a real-world dataset validate the effectiveness of our proposed framework. Our code and dataset are available at https://anonymous.4open.science/r/CLEAR-3048.
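
As a rough illustration of the four-stage flow named in the abstract (retriever, insight generator, case analyzer, legal reasoner), here is a hypothetical Python pipeline sketch. The prompts, the `call_llm` helper, and the toy lexical retriever are assumptions for illustration, not the released code.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for any chat/completions API call; returns model text."""
    raise NotImplementedError

def retrieve_legal_rules(facts: str, rule_corpus: list[str], top_k: int = 3) -> list[str]:
    # Toy lexical retriever: rank rules by word overlap with the criminal facts.
    fact_words = set(facts.lower().split())
    scored = sorted(rule_corpus, key=lambda r: -len(fact_words & set(r.lower().split())))
    return scored[:top_k]

def clear_pipeline(facts: str, rule_corpus: list[str]) -> str:
    rules = retrieve_legal_rules(facts, rule_corpus)                            # Legal Rules Retriever
    insights = call_llm(f"Explain how these rules apply:\n{rules}")             # Rule Insights Generator
    analysis = call_llm(f"Analyze the case.\nFacts: {facts}\nInsights: {insights}")  # Case Analyzer
    decision = call_llm(                                                        # Legal Reasoner
        f"Facts: {facts}\nInsights: {insights}\nAnalysis: {analysis}\n"
        "Predict the applicable legal paragraph."
    )
    return decision
```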

2024

Guided Knowledge Generation with Language Models for Commonsense Reasoning
Xiao Wei | Haoran Chen | Hang Yu | Hao Fei | Qian Liu
Findings of the Association for Computational Linguistics: EMNLP 2024

Large Language Models (LLMs) have achieved notable success in commonsense reasoning tasks, benefiting from the extensive world knowledge acquired during pretraining. While approaches like Chain-of-Thought (CoT) have shown promise in enhancing LLMs’ reasoning capabilities, mitigating the influence of inaccurate commonsense knowledge remains a challenge, particularly for small-scale LLMs (e.g., those with fewer than 10B parameters). In this work, we propose a novel method named Guided Knowledge Generation (GuideKG) to address these issues. It offers three advantages: (i) Employing LLMs to generate knowledge explanations and to automatically assign labels based on the probability of correct answers, which eliminates the need for costly manual annotation in subsequent training. (ii) Training a new module, the ‘Know-Filter’, to evaluate knowledge, together with a new loss introduced to enhance its performance. (iii) Evaluating the effectiveness of knowledge fragments at the sentence level and fusing them, which allows for precise control over the generation process of LLMs. We evaluate GuideKG on small-scale LLMs and show that it outperforms all baselines on four widely used commonsense reasoning benchmarks. Moreover, our experiments reveal that, with proper guidance, small-scale LLMs can exhibit exceptional performance in commonsense reasoning.
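
The sentence-level filter-and-fuse idea in the abstract can be sketched as follows. This is a minimal, hypothetical Python outline: the `generate_knowledge`, `score_knowledge`, and `answer` callables stand in for the paper's generator, Know-Filter, and reader, and the threshold is an assumption.

```python
from typing import Callable

def guided_answer(question: str,
                  generate_knowledge: Callable[[str], list[str]],
                  score_knowledge: Callable[[str, str], float],
                  answer: Callable[[str], str],
                  threshold: float = 0.5) -> str:
    # 1) Generate candidate knowledge explanations for the question.
    candidates = generate_knowledge(question)
    # 2) Score each sentence-level fragment and keep only high-scoring ones.
    kept = [k for k in candidates if score_knowledge(question, k) >= threshold]
    # 3) Fuse the retained knowledge into the prompt before answering.
    prompt = "\n".join(kept + [question])
    return answer(prompt)
```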

Divide and Conquer: Legal Concept-guided Criminal Court View Generation
Qi Xu | Xiao Wei | Hang Yu | Qian Liu | Hao Fei
Findings of the Association for Computational Linguistics: EMNLP 2024

The Criminal Court View Generation task aims to produce explanations that inform judicial decisions. This necessitates a nuanced understanding of diverse legal concepts, such as Recidivism, Confess, and Robbery, which often coexist within cases, complicating holistic analysis. However, existing methods mainly rely on the generation capability of language models without paying enough attention to these important legal concepts. To enhance the precision and depth of such explanations, we introduce Legal Concept-guided Criminal Court Views Generation (LeGen), a three-stage approach designed for iterative reasoning tailored to individual legal constructs. Specifically, in the first stage, we design a decomposer to divide the court views into focused sub-views, each anchored around a distinct legal concept. Next, a concept reasoning module generates targeted rationales by intertwining the deconstructed facts with their corresponding legal frameworks, ensuring contextually relevant interpretations. Finally, a verifier and a generator are employed to align the rationales with the case facts and to synthesize comprehensive, legally sound final court views, respectively. We evaluate LeGen with extensive experiments on a real-world dataset, and the results validate the effectiveness of our proposed model. Our code is available at https://anonymous.4open.science/r/LeGen-5625.
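
A rough Python outline of the divide-and-conquer flow described above is given below; the stage functions (`decompose`, `reason`, `verify`, `generate`) are hypothetical placeholders for the decomposer, concept reasoning module, verifier, and generator, not the authors' code.

```python
def legen_generate(case_facts: str, decompose, reason, verify, generate) -> str:
    # 1) Split the case into concept-anchored sub-views, e.g. [("Recidivism", sub_fact), ...].
    sub_views = decompose(case_facts)
    rationales = []
    for concept, sub_fact in sub_views:
        # 2) Produce a concept-guided rationale for each sub-view.
        rationale = reason(concept, sub_fact)
        # 3) Keep only rationales the verifier judges consistent with the case facts.
        if verify(case_facts, rationale):
            rationales.append(rationale)
    # 4) Synthesize the final court view from the retained rationales.
    return generate(case_facts, rationales)
```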