Kaijing Ma
2025
KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model’s Reasoning Path Aggregation
Siyuan Fang | Kaijing Ma | Tianyu Zheng | Xeron Du | Ningxuan Lu | Ge Zhang | Qingkun Tang
Findings of the Association for Computational Linguistics: ACL 2025
Large language models (LLMs) demonstrate exceptional performance across a variety of tasks, yet they often suffer from hallucinations and outdated knowledge. Leveraging knowledge graphs (KGs) as external knowledge sources has emerged as a viable solution, but existing methods for LLM-based knowledge graph question answering (KGQA) are often limited by step-by-step decision-making on KGs, which restricts the global planning and reasoning capabilities of LLMs, or they require fine-tuning or pre-training on specific KGs. To address these challenges, we propose Knowledge graph Assisted Reasoning Path Aggregation (KARPA), a novel framework that harnesses the global planning abilities of LLMs for efficient and accurate KG reasoning. KARPA operates in three steps: pre-planning relation paths using the LLM’s global planning capabilities, matching semantically relevant paths via an embedding model, and reasoning over these paths to generate answers. Unlike existing KGQA methods, KARPA avoids stepwise traversal, requires no additional training, and is adaptable to various LLM architectures. Extensive experimental results show that KARPA achieves state-of-the-art performance in KGQA tasks, delivering both high efficiency and accuracy.
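As an illustration only, the sketch below mirrors the three steps named in the abstract (global planning, embedding-based path matching, reasoning over matched paths); every function name and data structure here is a hypothetical placeholder, not the authors' implementation.

```python
# Minimal sketch of the three-step pipeline described in the KARPA abstract.
# All callables (llm_plan, embed, llm_reason) are assumed to be supplied by the user.
from typing import Callable, List
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))


def karpa_answer(
    question: str,
    kg_relation_paths: List[List[str]],                 # relation paths that actually exist in the KG
    llm_plan: Callable[[str], List[List[str]]],          # step 1: LLM pre-plans candidate relation paths
    embed: Callable[[str], np.ndarray],                  # step 2: embedding model for semantic matching
    llm_reason: Callable[[str, List[List[str]]], str],   # step 3: LLM reasons over matched paths
    top_k: int = 3,
) -> str:
    # Step 1: global planning -- ask the LLM for plausible relation paths
    # instead of traversing the KG step by step.
    planned_paths = llm_plan(question)

    # Step 2: for each planned path, retrieve the most semantically similar
    # paths that really exist in the KG, ranked by embedding similarity.
    matched: List[List[str]] = []
    for plan in planned_paths:
        plan_vec = embed(" -> ".join(plan))
        ranked = sorted(
            kg_relation_paths,
            key=lambda path: cosine(plan_vec, embed(" -> ".join(path))),
            reverse=True,
        )
        matched.extend(ranked[:top_k])

    # Step 3: let the LLM reason over the retrieved paths to produce the answer.
    return llm_reason(question, matched)
```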
2024
SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval
Siwei Wu | Yizhi Li | Kang Zhu | Ge Zhang | Yiming Liang | Kaijing Ma | Chenghao Xiao | Haoran Zhang | Bohao Yang | Wenhu Chen | Wenhao Huang | Noura Al Moubayed | Jie Fu | Chenghua Lin
Findings of the Association for Computational Linguistics: ACL 2024
Multi-modal information retrieval (MMIR) is a rapidly evolving field where significant progress has been made through advanced representation learning and cross-modality alignment research, particularly in image-text pairing. However, current benchmarks for evaluating MMIR performance on image-text pairings overlook the scientific domain, which differs notably from generic data: captions of scientific charts and tables usually describe the analysis of experimental results or scientific principles, in contrast to the human activity or scenery depicted in generic images. To bridge this gap, we develop a scientific domain-specific MMIR benchmark (SciMMIR) by leveraging open-access research paper corpora to extract data relevant to the scientific domain. This benchmark comprises 530K meticulously curated image-text pairs, extracted from figures and tables with detailed captions in scientific documents. We further annotate the image-text pairs with a two-level subset-subcategory hierarchy to facilitate a more comprehensive evaluation of the baselines. We conduct zero-shot and fine-tuned evaluations on prominent multi-modal image-captioning and visual language models, such as CLIP, BLIP, and BLIP-2. Our findings offer critical insights for MMIR in the scientific domain, including the impact of pre-training and fine-tuning settings and the effects of different visual and textual encoders.
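For context, the snippet below is a minimal sketch of the kind of zero-shot image-to-text retrieval evaluation the abstract mentions, using the public Hugging Face CLIP checkpoint; the dummy images and captions are placeholders for SciMMIR data, and this is not the benchmark's official pipeline.

```python
# Hedged sketch: zero-shot image->text retrieval with CLIP, reporting Recall@1.
# The images/captions below are synthetic stand-ins for SciMMIR figure-caption pairs.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Dummy figure images and captions (replace with real figure/table crops and their captions).
images = [Image.new("RGB", (224, 224), color) for color in ["white", "gray"]]
captions = [
    "Accuracy of the proposed model across three benchmarks.",
    "Ablation of encoder depth on retrieval recall.",
]

inputs = processor(text=captions, images=images, return_tensors="pt", padding=True)
with torch.no_grad():
    # logits_per_image has shape (num_images, num_captions): image-text similarity scores.
    sim = model(**inputs).logits_per_image

# Image->text Recall@1: in this toy setup the matching caption shares the image's index.
ranks = sim.argsort(dim=-1, descending=True)
recall_at_1 = (ranks[:, 0] == torch.arange(len(images))).float().mean().item()
print(f"zero-shot image->text Recall@1: {recall_at_1:.2f}")
```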