Thesis Proposal: On the Granularity-Robustness Trade-off in Text-Derived Knowledge Graphs

Surawat Pralomram


Abstract
Retrieval-augmented generation (RAG) based on dense embeddings has become a dominant paradigm for text retrieval. However, many real-world applications require attribute-specific querying, where explicit values or properties must be extracted from text (e.g., symptoms in clinical notes or dosage values in medical reports). Dense retrieval handles paraphrastic variation well but often entangles multiple attributes within a single embedding, making value extraction difficult. Knowledge graphs (KGs), in contrast, support explicit attribute access but are brittle under linguistic and structural variation, leading to low recall. This thesis proposal investigates the representational trade-off underlying these approaches. We study knowledge graph representations from an information-theoretic and optimal coding perspective, focusing on the tension between fine-grained factorization and compact canonicalization of concepts. Building on this perspective, we propose a query-driven framework for constructing and retrieving knowledge graphs from text, aiming to combine the robustness of dense retrieval with the explicit queryability of symbolic representations.
Anthology ID:
2026.acl-srw.17
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
173–187
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.17/
Cite (ACL):
Surawat Pralomram. 2026. Thesis Proposal: On the Granularity-Robustness Trade-off in Text-Derived Knowledge Graphs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 173–187, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Thesis Proposal: On the Granularity-Robustness Trade-off in Text-Derived Knowledge Graphs (Pralomram, ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.17.pdf