Authorization-First Retrieval: Enforcing Least Privilege in Multi-Agent RAG Systems

Rohith Namboothiri


Abstract
Retrieval-augmented generation systems serving multiple users under role-based access control face a trustworthiness gap: semantic retrieval operates on embedding similarity rather than authorization predicates and can introduce unauthorized content into a model’s context window before any filter intervenes. We formalize this as a pipeline ordering problem and introduce Authorization-First Retrieval (AFR), an architectural invariant requiring that authorization constrain the retrieval candidate set before any learned component consumes retrieved content. We reduce authorization correctness to the classical noninterference property and prove AFR is necessary whenever the processing model violates noninterference—a condition our experiments confirm empirically. Evaluation on a controlled corpus of 247 chunks across 232 documents with 431 base queries spanning 12 enterprise roles and 9 domains (584 total queries including negation exploitation and parametric probes) shows that retrieve-then-filter pipelines expose unauthorized context in 86.1% of queries, while AFR eliminates structural leaks by construction. Cross-model experiments with Gemini 2.0 Flash and GPT-4o-mini reveal that structural exposure is an architectural property independent of the underlying model, whereas behavioral defenses fail at model-dependent rates, producing answer leakage of 41.3% and 29.5% respectively under retrieve-then-filter. A negation exploitation study demonstrates consistent disclosure vulnerabilities across framing types, while a metadata-tag freshness ablation shows that conditional authorization mechanisms degrade under realistic policy staleness. Stress tests across retrieval depths and chunking granularities confirm AFR’s robustness. Our results demonstrate that behavioral guardrails and metadata tagging cannot reliably enforce least privilege in RAG pipelines, while authorization-first architectures provide a verifiable and model-independent security guarantee.
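The pipeline ordering contrast described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `Chunk` type, the `allowed_roles` field, the word-overlap `score` function (a toy stand-in for embedding similarity), and the three-chunk corpus are all hypothetical and chosen only to show where the authorization check sits relative to ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    allowed_roles: frozenset  # hypothetical per-chunk access list


def score(query: str, chunk: Chunk) -> int:
    """Toy stand-in for embedding similarity: count shared lowercase words."""
    return len(set(query.lower().split()) & set(chunk.text.lower().split()))


def top_k(query, chunks, k):
    """Return the k highest-scoring chunks (stable for ties)."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]


def retrieve_then_filter(query, role, corpus, k):
    """Rank the whole corpus first, then drop unauthorized chunks.

    Returns (exposed, kept): `exposed` is everything the ranking stage
    surfaced, including chunks the role may not read.
    """
    exposed = top_k(query, corpus, k)
    kept = [c for c in exposed if role in c.allowed_roles]
    return exposed, kept


def authorization_first(query, role, corpus, k):
    """Restrict the candidate set to authorized chunks before any ranking."""
    candidates = [c for c in corpus if role in c.allowed_roles]
    return top_k(query, candidates, k)


# Hypothetical corpus: one HR-only chunk and two engineer-readable chunks.
CORPUS = [
    Chunk("hr-1", "salary bands for engineering staff", frozenset({"hr"})),
    Chunk("eng-1", "engineering onboarding guide for new staff",
          frozenset({"engineer", "hr"})),
    Chunk("eng-2", "deployment checklist for engineering services",
          frozenset({"engineer"})),
]

query = "engineering staff salary"
exposed, kept = retrieve_then_filter(query, "engineer", CORPUS, k=2)
afr = authorization_first(query, "engineer", CORPUS, k=2)

print([c.chunk_id for c in exposed])  # ranking stage output, pre-filter
print([c.chunk_id for c in kept])     # what survives the post-filter
print([c.chunk_id for c in afr])      # authorization-first result
```

In this toy setup the retrieve-then-filter path ranks the HR-only chunk highest for an engineer's query, so it reaches the stage between ranking and filtering, and the post-filter then returns fewer than k results. The authorization-first path never scores that chunk at all, which is the structural property the abstract attributes to AFR.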
Anthology ID:
2026.trustnlp-main.15
Volume:
Proceedings of the 6th Workshop on Trustworthy NLP (TrustNLP 2026)
Month:
July
Year:
2026
Address:
San Diego, California
Editors:
Kai-Wei Chang, Ninareh Mehrabi, Satyapriya Krishna, Anubrata Das, Jwala Dhamala, Yang Trista Cao, Tharindu Kumarage, Anil Ramakrishna, Christos Christodoulopoulos, Yixin Wan, Aram Galstyan, Anoop Kumar, Rahul Gupta
Venues:
TrustNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
256–271
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.trustnlp-main.15/
Cite (ACL):
Rohith Namboothiri. 2026. Authorization-First Retrieval: Enforcing Least Privilege in Multi-Agent RAG Systems. In Proceedings of the 6th Workshop on Trustworthy NLP (TrustNLP 2026), pages 256–271, San Diego, California. Association for Computational Linguistics.
Cite (Informal):
Authorization-First Retrieval: Enforcing Least Privilege in Multi-Agent RAG Systems (Namboothiri, TrustNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.trustnlp-main.15.pdf