Learning from Litigation: Graphs for Retrieval and Reasoning in eDiscovery

Sounak Lahiri, Sumit Pai, Tim Weninger, Sanmitra Bhattacharya


Abstract
Electronic Discovery (eDiscovery) requires identifying relevant documents from vast collections for legal production requests. While artificial intelligence (AI) and natural language processing (NLP) have improved document review efficiency, current methods still struggle with legal entities, citations, and complex legal artifacts. To address these challenges, we introduce DISCOvery Graph (DISCOG), an emerging system that integrates knowledge graphs for enhanced document ranking and classification, augmented by LLM-driven reasoning. DISCOG outperforms strong baselines in F1-score, precision, and recall across both balanced and imbalanced datasets. In real-world deployments, it has reduced litigation-related document review costs by approximately 98%, demonstrating significant business impact.
Anthology ID:
2025.acl-industry.46
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Georg Rehm, Yunyao Li
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
661–671
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-industry.46/
Cite (ACL):
Sounak Lahiri, Sumit Pai, Tim Weninger, and Sanmitra Bhattacharya. 2025. Learning from Litigation: Graphs for Retrieval and Reasoning in eDiscovery. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), pages 661–671, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Learning from Litigation: Graphs for Retrieval and Reasoning in eDiscovery (Lahiri et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-industry.46.pdf