MOD-KG: MultiOrgan Diagnosis Knowledge Graph

Anas Anwarul Haq Khan, Pushpak Bhattacharyya


Abstract
The human body is highly interconnected: a diagnosis in one organ can influence conditions in others. In medical research, graphs (such as Knowledge Graphs and Causal Graphs) have proven useful for capturing these relationships, but constructing them manually with expert input is both costly and time-intensive, especially given the continuous flow of new findings. To address this, we leverage the extraction capabilities of large language models (LLMs) to build the **MultiOrgan Diagnosis Knowledge Graph (MOD-KG)**. MOD-KG contains over **21,200 knowledge triples**, derived from both textbooks **(~13%)** and carefully selected research papers (with an average of **444** citations each). The graph focuses primarily on the *heart, lungs, kidneys, liver, pancreas, and brain*, which are central to much of today’s multimodal imaging research. The extraction quality of the LLM was benchmarked against baselines on **1,000** samples, demonstrating reliability. We will make our dataset public upon acceptance.
Anthology ID:
2025.nlpai4health-main.2
Volume:
NLP-AI4Health
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Parameswari Krishnamurthy, Vandan Mujadia, Dipti Misra Sharma, Hannah Mary Thomas
Venues:
NLP-AI4Health | WS
Publisher:
Association for Computational Linguistics
Pages:
9–15
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.nlpai4health-main.2/
Cite (ACL):
Anas Anwarul Haq Khan and Pushpak Bhattacharyya. 2025. MOD-KG: MultiOrgan Diagnosis Knowledge Graph. In NLP-AI4Health, pages 9–15, Mumbai, India. Association for Computational Linguistics.
Cite (Informal):
MOD-KG: MultiOrgan Diagnosis Knowledge Graph (Khan & Bhattacharyya, NLP-AI4Health 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.nlpai4health-main.2.pdf