Not all Hallucinations are Good to Throw Away When it Comes to Legal Abstractive Summarization
Nihed Bendahman, Karen Pinel-Sauvagnat, Gilles Hubert, Mokhtar Boumedyen Billami
Abstract
Automatic summarization of legal documents requires a thorough understanding of their specificities, mainly with respect to the vocabulary used by legal experts. Indeed, experts rely heavily on their external knowledge when writing summaries, in order to contextualize the main entities of the source document. This leads to reference summaries containing many abstractions that state-of-the-art models struggle to generate. In this paper, we propose an entity-driven approach that teaches the model to generate factual hallucinations, as close as possible to the abstractions of the reference summaries. We evaluated our approach on two different datasets, with legal documents in English and French. Results show that our approach reduces non-factual hallucinations and maximizes both summary coverage and factual hallucinations at the entity level. Moreover, the overall quality of the summaries also improves, showing that guiding summarization with entities is a valuable solution for legal document summarization.
- Anthology ID:
- 2025.naacl-long.275
- Volume:
- Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
- Month:
- April
- Year:
- 2025
- Address:
- Albuquerque, New Mexico
- Editors:
- Luis Chiruzzo, Alan Ritter, Lu Wang
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5331–5344
- URL:
- https://preview.aclanthology.org/landing_page/2025.naacl-long.275/
- Cite (ACL):
- Nihed Bendahman, Karen Pinel-Sauvagnat, Gilles Hubert, and Mokhtar Boumedyen Billami. 2025. Not all Hallucinations are Good to Throw Away When it Comes to Legal Abstractive Summarization. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 5331–5344, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Not all Hallucinations are Good to Throw Away When it Comes to Legal Abstractive Summarization (Bendahman et al., NAACL 2025)
- PDF:
- https://preview.aclanthology.org/landing_page/2025.naacl-long.275.pdf