OG-RAG: Ontology-grounded retrieval-augmented generation for large language models

Kartik Sharma, Peeyush Kumar, Yunqing Li


Abstract
While LLMs are widely used for generic tasks such as question answering and search, they struggle to adapt, without expensive fine-tuning or sub-optimal retrieval methods, to specialized knowledge such as industrial workflows in the healthcare, legal, and agricultural sectors, or to knowledge-driven tasks such as news journalism, investigative research, and consulting. Existing retrieval-augmented models, such as RAG, offer improvements but fail to account for structured domain knowledge, leading to suboptimal context generation. Ontologies, which conceptually organize domain knowledge by defining entities and their interrelationships, offer a structured representation to address this gap. This paper presents OG-RAG, an Ontology-Grounded Retrieval-Augmented Generation method designed to enhance LLM-generated responses by anchoring retrieval processes in domain-specific ontologies. OG-RAG constructs a hypergraph representation of domain documents, where each hyperedge encapsulates a cluster of factual knowledge grounded in a domain-specific ontology, and retrieves a minimal set of hyperedges for a given query using an optimization algorithm. Our evaluations demonstrate that OG-RAG increases the recall of accurate facts by 55% and improves response correctness by 40% across four different LLMs. Additionally, OG-RAG enables 30% faster attribution of responses to context and boosts fact-based reasoning accuracy by 27% compared to baseline methods. We release the code at [https://github.com/microsoft/ograg2](https://github.com/microsoft/ograg2).
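The retrieval step described above — selecting a minimal set of fact-covering hyperedges for a query — can be sketched as a greedy set-cover heuristic. This is an illustrative toy, not the paper's actual optimization algorithm: the hyperedge/fact representation, the `greedy_minimal_hyperedges` function, and the query-to-fact matching are all assumptions made for the example.

```python
# Hypothetical sketch: each hyperedge groups ontology-grounded facts,
# and we greedily choose a small set of hyperedges whose union covers
# the facts matched to the query (a classic set-cover heuristic).

def greedy_minimal_hyperedges(hyperedges, relevant_facts):
    """hyperedges: dict mapping hyperedge id -> set of fact ids.
    relevant_facts: set of fact ids matched to the query.
    Returns a small list of hyperedge ids covering those facts."""
    uncovered = set(relevant_facts)
    chosen = []
    while uncovered:
        # Pick the hyperedge that covers the most still-uncovered facts.
        best = max(hyperedges, key=lambda e: len(hyperedges[e] & uncovered))
        gain = hyperedges[best] & uncovered
        if not gain:  # remaining facts appear in no hyperedge
            break
        chosen.append(best)
        uncovered -= gain
    return chosen

# Toy hypergraph: three hyperedges over five facts.
edges = {
    "e1": {"f1", "f2", "f3"},
    "e2": {"f3", "f4"},
    "e3": {"f5"},
}
print(greedy_minimal_hyperedges(edges, {"f1", "f3", "f4"}))  # ['e1', 'e2']
```

Greedy set cover is a standard approximation for this kind of minimal-cover objective; the paper's own formulation and solver may differ.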
Anthology ID:
2025.emnlp-main.1674
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
32950–32969
URL:
https://preview.aclanthology.org/ingest-luhme/2025.emnlp-main.1674/
DOI:
10.18653/v1/2025.emnlp-main.1674
Cite (ACL):
Kartik Sharma, Peeyush Kumar, and Yunqing Li. 2025. OG-RAG: Ontology-grounded retrieval-augmented generation for large language models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 32950–32969, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
OG-RAG: Ontology-grounded retrieval-augmented generation for large language models (Sharma et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-luhme/2025.emnlp-main.1674.pdf
Checklist:
2025.emnlp-main.1674.checklist.pdf