CURUPIRA: Clever guard for harm and linguistic prompt mitigation in Brazilian Portuguese
Rogério Sousa, William Alberto Cruz-Castañeda, José Roberto Homeli Silva, Marcellus Amadeus
Abstract
The safe deployment of Large Language Models remains challenging in multilingual settings, particularly when models are exposed to adversarial or malicious prompts in underrepresented languages. In this work, we present Curupira, a Brazilian Portuguese-language guard model designed to mitigate harmful prompt exploitation. To this end, we establish a three-step methodology that involves adaptation, data generation, and fine-tuning. We also evaluate our model with two state-of-the-art open guardrail architectures. The results show that targeted fine-tuning leads to consistent improvements in safety classification for Portuguese prompts, with favorable efficiency–performance trade-offs for compact models and limited degradation in cross-lingual evaluation.
- Anthology ID:
- 2026.propor-1.107
- Volume:
- Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
- Month:
- April
- Year:
- 2026
- Address:
- Salvador, Brazil
- Editors:
- Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
- Venue:
- PROPOR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1038–1043
- URL:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.107/
- Cite (ACL):
- Rogério Sousa, William Alberto Cruz-Castañeda, José Roberto Homeli Silva, and Marcellus Amadeus. 2026. CURUPIRA: Clever guard for harm and linguistic prompt mitigation in Brazilian Portuguese. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 1038–1043, Salvador, Brazil. Association for Computational Linguistics.
- Cite (Informal):
- CURUPIRA: Clever guard for harm and linguistic prompt mitigation in Brazilian Portuguese (Sousa et al., PROPOR 2026)
- PDF:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.107.pdf