Mitigating Misinterpretation in Policy Documents through Automated Language Understanding

Momojit Biswas; Anka Chandrahas Tummepalli; Preethu Rose Anish

Abstract

Policy documents often employ intricate and technical language, posing comprehension challenges for policyholders and increasing the risk of misinterpretation, financial losses, and legal disputes. To address these issues, we propose an automated framework leveraging Retrieval-Augmented Generation to identify and clarify potentially misinterpretable paragraphs within policy documents. The framework consists of two key modules: the Annotation module and the Rectification module. The Annotation module employs both paragraph-level and document-level contextual reasoning to classify paragraphs into categories indicative of potential misinterpretation. The Rectification module resolves these ambiguities by generating targeted interpretation queries, retrieving relevant document-level context, and incorporating external knowledge sources. Applied to a corpus of 240 real-world policy documents, the Annotation module produced a benchmark dataset comprising 11,000 annotated paragraphs, enabling systematic evaluation of interpretability issues. We assessed the dataset’s quality through expert-driven manual reviews and large-scale automated evaluations using a fine-tuned Pretrained Language Model. For the Rectification module, we evaluated five open-source Large Language Models: Mistral-2-7B, Mistral-3-7B, LLaMA-2-7B, LLaMA-3-8B, and Saul-7B. Among these, Mistral-2-7B achieved the highest human evaluation scores: 0.912 for Clarity, 0.914 for Fidelity, and 0.934 for Usefulness. This work demonstrates the practical feasibility of utilizing automated frameworks to enhance the clarity and comprehensibility of complex policy documents, thereby mitigating risks associated with misinterpretation and its adverse consequences.

Anthology ID: 2026.lrec-main.651
Volume: Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month: May
Year: 2026
Address: Palma de Mallorca, Spain
Editors: Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue: LREC
Publisher: ELRA Language Resource Association
Pages: 8217–8234
URL: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.651/
Cite (ACL): Momojit Biswas, Anka Chandrahas Tummepalli, and Preethu Rose Anish. 2026. Mitigating Misinterpretation in Policy Documents through Automated Language Understanding. International Conference on Language Resources and Evaluation, main:8217–8234.
Cite (Informal): Mitigating Misinterpretation in Policy Documents through Automated Language Understanding (Biswas et al., LREC 2026)
PDF: https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.651.pdf
