Chun Yang

2026

N-ary knowledge graph completion (KGC) aims to infer missing components in facts that involve multiple entities under distinct semantic roles, and is commonly formulated as a knowledge hypergraph link prediction task. Most embedding-based approaches score individual hyperedges using enriched structural representations, but overlook intermediate propagation states that contain complementary local and global structural evidence. Although large language models (LLMs) can generate chain-of-thought (CoT) representations for the classical KGC task, they struggle with hypergraph structures involving multiple facts, and current hypergraph QA methods provide LLMs with only a single query signal rather than path-level evidence. These limitations hinder the transfer of existing methods, especially those leveraging LLMs, to the knowledge hypergraph link prediction problem. To bridge this gap, we propose HyperCoT, a structure-aware approach that models multi-hop structural reasoning as a depth-sensitive, progressive evidence accumulation process. It constructs a Graphical Chain-of-Thought (Graph-CoT) by aggregating role-aware hyperedge states along strongly correlated reasoning paths, and injects the resulting path-level structural evidence into each token of the query and candidate entities to prompt LLMs. Experiments on three real-world datasets demonstrate that HyperCoT consistently outperforms strong n-ary KGC baselines, particularly in high-arity and structurally sparse scenarios, while also yielding interpretable multi-hop reasoning traces.

2025

Link prediction in knowledge graphs (KGs) requires integrating structural information and semantic context to infer missing entities. While large language models (LLMs) offer strong generative reasoning capabilities, their limited exploitation of structural signals often results in *structural sparsity* and *semantic ambiguity*, especially under incomplete or zero-shot settings. To address these challenges, we propose **SLiNT** (**S**tructure-aware **L**anguage model with **I**njection and co**N**trastive **T**raining), a modular framework that injects KG-derived structural context into a frozen LLM backbone with lightweight LoRA-based adaptation for robust link prediction. Specifically, **Structure-Guided Neighborhood Enhancement (SGNE)** retrieves pseudo-neighbors to enrich sparse entities and mitigate missing context; **Dynamic Hard Contrastive Learning (DHCL)** introduces fine-grained supervision by interpolating hard positives and negatives to resolve entity-level ambiguity; and **Gradient-Decoupled Dual Injection (GDDI)** performs token-level structure-aware intervention while preserving the core LLM parameters. Experiments on WN18RR and FB15k-237 show that SLiNT achieves superior or competitive performance compared with both embedding-based and generation-based baselines, demonstrating the effectiveness of structure-aware representation learning for scalable knowledge graph completion.

2024

Arbitrary Time Information Modeling via Polynomial Approximation for Temporal Knowledge Graph Embedding
Zhiyu Fang | Jingyan Qin | Xiaobin Zhu | Chun Yang | Xu-Cheng Yin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Unlike traditional knowledge graphs (KGs), temporal knowledge graphs (TKGs) must adequately explore and reason over temporally evolving facts. However, existing TKG approaches still face two main challenges: a limited capability to model arbitrary timestamps continuously, and a lack of rich inference patterns under temporal constraints. In this paper, we propose PTBox, a temporal knowledge graph embedding (TKGE) method that combines a polynomial decomposition-based temporal representation with a box embedding-based entity representation to tackle these problems. Specifically, we decompose time information by polynomials and then enhance the model’s capability to represent arbitrary timestamps flexibly by incorporating a learnable temporal basis tensor. In addition, we model every entity as a hyperrectangle box and define each relation as a transformation on the head and tail entity boxes. The entity boxes can capture complex geometric structures and learn robust representations, improving the model’s inductive capability for rich inference patterns. In theory, PTBox can encode arbitrary time information, including unseen timestamps, while capturing rich inference patterns and higher-arity relations of the knowledge base. Extensive experiments on real-world datasets demonstrate the effectiveness of our method.
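
The two ideas named in the abstract above, a polynomial representation of time and hyperrectangle boxes for entities, can be illustrated with a small sketch. This is not the PTBox model: the abstract does not give its equations, so the choices here are assumptions made only for illustration. In particular, the relation "transformation" is assumed to be a plain translation by the time embedding, the score is assumed to be the intersection volume of two boxes, and all function names (`time_embedding`, `make_box`, `translate_box`, `intersection_volume`) are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


def time_embedding(t: float, basis: Sequence[Sequence[float]]) -> List[float]:
    """Evaluate sum_k basis[k] * t**k, one polynomial per embedding dimension.

    `basis` stands in for a learnable temporal basis (assumption: shape
    [degree + 1][dim]). Any real-valued t can be evaluated, including
    timestamps never seen during training.
    """
    dim = len(basis[0])
    out = [0.0] * dim
    for k, coeffs in enumerate(basis):
        power = t ** k
        for d in range(dim):
            out[d] += coeffs[d] * power
    return out


def make_box(center: Sequence[float], offset: Sequence[float]) -> Box:
    """Build an axis-aligned box from a center and a per-dimension half-width."""
    lower = [c - abs(o) for c, o in zip(center, offset)]
    upper = [c + abs(o) for c, o in zip(center, offset)]
    return lower, upper


def translate_box(box: Box, shift: Sequence[float]) -> Box:
    """Assumed relation/time transformation: shift both corners by `shift`."""
    lower, upper = box
    return ([l + s for l, s in zip(lower, shift)],
            [u + s for u, s in zip(upper, shift)])


def intersection_volume(a: Box, b: Box) -> float:
    """Volume of the overlap of two boxes; 0.0 if they are disjoint."""
    volume = 1.0
    for la, ua, lb, ub in zip(a[0], a[1], b[0], b[1]):
        side = min(ua, ub) - max(la, lb)
        if side <= 0:
            return 0.0
        volume *= side
    return volume


# Toy values, chosen by hand rather than learned.
basis = [[0.0, 1.0], [1.0, 0.0], [0.5, -0.5]]   # degree-2 polynomial, dim 2
shift = time_embedding(2.0, basis)               # [4.0, -1.0]
head = translate_box(make_box([0.0, 0.0], [1.0, 1.0]), shift)
tail = make_box([4.5, -0.5], [1.0, 1.0])
score = intersection_volume(head, tail)          # 2.25
```

Because the time embedding is a polynomial in `t`, the sketch accepts any real-valued timestamp, which is the property the abstract describes as modeling arbitrary timestamps continuously.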
