Development and Evaluation of a Hybrid Information Retrieval System Applied to the Brazilian Legal Domain
Ana Carolina C. Bessa, Fábio M. F. Lobato, Antonio F. L. J. Junior
Abstract
The need for tools that assist in process management, automating tasks and reducing the slowness of the judicial system, justifies the improvement of traditional Information Retrieval systems, often limited by vocabulary incompatibility and the length of legal texts. Although models based on Transformers capture semantic particularities, they face input size constraints that make it difficult to process long texts without losing information. In this work, we propose a hybrid system applied to the legal domain, combining the BM25L algorithm and the BumbaLM language model.
- Anthology ID:
- 2026.propor-2.26
- Volume:
- Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2
- Month:
- April
- Year:
- 2026
- Address:
- Salvador, Brazil
- Editors:
- Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
- Venue:
- PROPOR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 186–190
- URL:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-2.26/
- Cite (ACL):
- Ana Carolina C. Bessa, Fábio M. F. Lobato, and Antonio F. L. J. Junior. 2026. Development and Evaluation of a Hybrid Information Retrieval System Applied to the Brazilian Legal Domain. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 2, pages 186–190, Salvador, Brazil. Association for Computational Linguistics.
- Cite (Informal):
- Development and Evaluation of a Hybrid Information Retrieval System Applied to the Brazilian Legal Domain (Bessa et al., PROPOR 2026)
- PDF:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-2.26.pdf
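
The abstract names BM25L as the lexical component of the hybrid system but does not say how its scores are combined with the BumbaLM language model. The sketch below is therefore only an illustration under stated assumptions: `bm25l_scores` follows the standard BM25L formulation with commonly used defaults (`k1=1.5`, `b=0.75`, `delta=0.5`), which are not taken from the paper, and `hybrid_scores` is a hypothetical weighted fusion of min-max normalised scores. The semantic scores are placeholder numbers standing in for similarities that a language model might produce; no model is loaded here.

```python
import math
from collections import Counter


def bm25l_scores(query_tokens, docs_tokens, k1=1.5, b=0.75, delta=0.5):
    """Score each tokenised document against the query with BM25L.

    Parameter defaults are common choices, not values from the paper.
    """
    n_docs = len(docs_tokens)
    avgdl = sum(len(doc) for doc in docs_tokens) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs_tokens for term in set(doc))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            if tf[term] == 0:
                continue  # only terms shared by query and document contribute
            idf = math.log((n_docs + 1) / (df[term] + 0.5))
            # Length-normalised term frequency; delta lifts it so that
            # long documents are penalised less than under plain BM25.
            ctd = tf[term] / (1 - b + b * len(doc) / avgdl)
            score += idf * (k1 + 1) * (ctd + delta) / (k1 + ctd + delta)
        scores.append(score)
    return scores


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_scores(lexical, semantic, alpha=0.5):
    """Hypothetical fusion: weighted sum of normalised score lists.

    This is an assumed combination rule, not the authors' method.
    """
    lex, sem = min_max(lexical), min_max(semantic)
    return [alpha * l + (1 - alpha) * s for l, s in zip(lex, sem)]


# Toy corpus of whitespace-tokenised Portuguese snippets (invented examples).
docs = [
    "recurso de apelação em ação de cobrança".split(),
    "habeas corpus prisão preventiva".split(),
    "ação de cobrança contrato de aluguel".split(),
]
query = "ação de cobrança".split()

lexical = bm25l_scores(query, docs)
semantic = [0.2, 0.1, 0.9]  # placeholder similarities, not real model output
combined = hybrid_scores(lexical, semantic)
ranking = sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)
```

Normalising before fusing keeps either signal from dominating purely because of its scale, and `alpha` makes the lexical/semantic trade-off explicit. A two-stage design, where BM25L selects candidates and the language model reranks them, would be an equally plausible reading of "hybrid"; the abstract does not settle which one the authors use.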