Learning from Litigation: Graphs for Retrieval and Reasoning in eDiscovery
Sounak Lahiri, Sumit Pai, Tim Weninger, Sanmitra Bhattacharya
Abstract
Electronic Discovery (eDiscovery) requires identifying relevant documents from vast collections for legal production requests. While artificial intelligence (AI) and natural language processing (NLP) have improved document review efficiency, current methods still struggle with legal entities, citations, and complex legal artifacts. To address these challenges, we introduce DISCOvery Graph (DISCOG), an emerging system that integrates knowledge graphs for enhanced document ranking and classification, augmented by LLM-driven reasoning. DISCOG outperforms strong baselines in F1-score, precision, and recall across both balanced and imbalanced datasets. In real-world deployments, it has reduced litigation-related document review costs by approximately 98%, demonstrating significant business impact.
- Anthology ID:
- 2025.acl-industry.46
- Volume:
- Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Georg Rehm, Yunyao Li
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 661–671
- URL:
- https://preview.aclanthology.org/ingestion-acl-25/2025.acl-industry.46/
- Cite (ACL):
- Sounak Lahiri, Sumit Pai, Tim Weninger, and Sanmitra Bhattacharya. 2025. Learning from Litigation: Graphs for Retrieval and Reasoning in eDiscovery. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), pages 661–671, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Learning from Litigation: Graphs for Retrieval and Reasoning in eDiscovery (Lahiri et al., ACL 2025)
- PDF:
- https://preview.aclanthology.org/ingestion-acl-25/2025.acl-industry.46.pdf