LexCLiPR: Cross-Lingual Paragraph Retrieval from Legal Judgments

Rohit Upadhya, Santosh T.y.s.s


Abstract
Efficient retrieval of pinpointed information from case law is crucial for legal professionals but challenging due to the length and complexity of legal judgments. Existing work mostly focuses on retrieving entire cases rather than precise, paragraph-level information. Moreover, although multilingual legal practice necessitates cross-lingual retrieval, most work has been limited to monolingual settings. To address these gaps, we introduce LexCLiPR, a cross-lingual dataset for paragraph-level retrieval from European Court of Human Rights (ECtHR) judgments, curated using multilingual case law guides and distant supervision. We evaluate retrieval models in a zero-shot setting, revealing the limitations of pre-trained multilingual models for cross-lingual tasks in low-resource languages and the importance of retrieval-based post-training strategies. In fine-tuning settings, we observe that two-tower models excel in cross-lingual retrieval, while siamese architectures are better suited for monolingual tasks. Fine-tuning multilingual models on native-language queries improves performance, but the models struggle to generalize to unseen legal concepts, highlighting the need for robust strategies to address topical distribution shifts in legal queries.
Anthology ID:
2025.acl-long.683
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
13971–13993
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.683/
Cite (ACL):
Rohit Upadhya and Santosh T.y.s.s. 2025. LexCLiPR: Cross-Lingual Paragraph Retrieval from Legal Judgments. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13971–13993, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
LexCLiPR: Cross-Lingual Paragraph Retrieval from Legal Judgments (Upadhya & T.y.s.s, ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.683.pdf