Jiafan Li


2025

SLiNT: Structure-aware Language Model with Injection and Contrastive Training for Knowledge Graph Completion
Mengxue Yang | Chun Yang | Jiaqi Zhu | Jiafan Li | Jingqi Zhang | Yuyang Li | Ying Li
Findings of the Association for Computational Linguistics: EMNLP 2025

Link prediction in knowledge graphs (KGs) requires integrating structural information and semantic context to infer missing entities. While large language models (LLMs) offer strong generative reasoning capabilities, their limited exploitation of structural signals often results in *structural sparsity* and *semantic ambiguity*, especially under incomplete or zero-shot settings. To address these challenges, we propose **SLiNT** (**S**tructure-aware **L**anguage model with **I**njection and co**N**trastive **T**raining), a modular framework that injects KG-derived structural context into a frozen LLM backbone with lightweight LoRA-based adaptation for robust link prediction. Specifically, **Structure-Guided Neighborhood Enhancement (SGNE)** retrieves pseudo-neighbors to enrich sparse entities and mitigate missing context; **Dynamic Hard Contrastive Learning (DHCL)** introduces fine-grained supervision by interpolating hard positives and negatives to resolve entity-level ambiguity; and **Gradient-Decoupled Dual Injection (GDDI)** performs token-level structure-aware intervention while preserving the core LLM parameters. Experiments on WN18RR and FB15k-237 show that SLiNT achieves superior or competitive performance compared with both embedding-based and generation-based baselines, demonstrating the effectiveness of structure-aware representation learning for scalable knowledge graph completion.
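As a rough illustration of the DHCL component described in the abstract, the sketch below shows one plausible way to interpolate hard positives and negatives for an InfoNCE-style objective. The function name, interpolation scheme, and hyperparameters here are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def dhcl_loss(anchor, positive, negatives, alpha=0.5, temperature=0.07):
    """Hypothetical sketch of dynamic hard contrastive learning:
    interpolate hard positives/negatives around the anchor entity embedding."""
    # Pick the hardest negative (the one most similar to the anchor).
    sims = F.cosine_similarity(anchor.unsqueeze(0), negatives, dim=-1)
    hard_neg = negatives[sims.argmax()]
    # Interpolate to synthesize examples near the decision boundary.
    hard_pos = alpha * positive + (1 - alpha) * anchor
    synth_neg = alpha * hard_neg + (1 - alpha) * positive
    # InfoNCE-style objective: the interpolated positive is the target class.
    candidates = torch.stack([hard_pos, synth_neg, *negatives])
    logits = F.cosine_similarity(anchor.unsqueeze(0), candidates, dim=-1) / temperature
    return F.cross_entropy(logits.unsqueeze(0), torch.tensor([0]))
```

In this reading, the interpolation coefficient `alpha` controls how close the synthesized examples sit to the entity-level decision boundary; the actual SLiNT formulation should be taken from the paper itself.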

2024

Advancement in Graph Understanding: A Multimodal Benchmark and Fine-Tuning of Vision-Language Models
Qihang Ai | Jiafan Li | Jincheng Dai | Jianwu Zhou | Lemao Liu | Haiyun Jiang | Shuming Shi
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Graph data organizes complex relationships and interactions between objects, facilitating advanced analysis and decision-making across different fields. In this paper, we propose a new paradigm for interactive and instructional graph data understanding and reasoning. Instead of adopting complex graph neural models or heuristic graph-to-text instruction design, we leverage Vision-Language Models (VLMs) to encode graph images with varying structures across different domains. This paper first evaluates the capabilities of public VLMs in graph learning from multiple aspects. It then introduces a novel instruction-following dataset for multimodal graph understanding and reasoning in English and Chinese. Moreover, by fine-tuning MiniGPT-4 and LLaVA on our dataset, we achieved an accuracy increase of 5%-15% compared to baseline models, with the best-performing model attaining scores comparable to Gemini in GPT-assisted evaluation. This research not only showcases the potential of integrating VLMs with graph data but also opens new avenues for advancements in graph data understanding.
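As a rough sketch of what one instruction-following record for graph-image understanding might look like, the example below uses assumed field names; the released dataset's actual schema may differ.

```python
# Hypothetical example of a single multimodal instruction-following record
# for graph understanding; field names and paths are assumptions.
sample = {
    "image": "graphs/citation_network_042.png",  # rendered graph image
    "instruction": "How many nodes are directly connected to node A?",
    "answer": "Node A has 3 neighbors: B, C, and D.",
    "language": "en",  # the dataset covers both English and Chinese
}
```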