Negation-Aware Data Augmentation for Portuguese Natural Language Inference
Maria Cecília M. Corrêa, Felipe S. F. Paula, Matheus Westhelle, Viviane P. Moreira
Abstract
Negation plays a fundamental role in human communication and logical reasoning, yet it remains underrepresented in natural language inference (NLI) datasets. This work investigates the impact of targeted data augmentation using negation cues on the main NLI datasets for Portuguese (InferBR, ASSIN and ASSIN2). By synthetically generating new instances with negated hypotheses, we create more diverse training and test sets. A BERT-based model was fine-tuned and tested on the combined datasets and augmented data. The results show that the model was heavily influenced by the bias in the use of negation, and increased data diversity improves the model’s handling of negation.
- Anthology ID:
- 2026.propor-1.14
- Volume:
- Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
- Month:
- April
- Year:
- 2026
- Address:
- Salvador, Brazil
- Editors:
- Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
- Venue:
- PROPOR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 141–150
- URL:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.14/
- Cite (ACL):
- Maria Cecília M. Corrêa, Felipe S. F. Paula, Matheus Westhelle, and Viviane P. Moreira. 2026. Negation-Aware Data Augmentation for Portuguese Natural Language Inference. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 141–150, Salvador, Brazil. Association for Computational Linguistics.
- Cite (Informal):
- Negation-Aware Data Augmentation for Portuguese Natural Language Inference (Corrêa et al., PROPOR 2026)
- PDF:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.14.pdf
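
The abstract refers to a bias in how negation is used across NLI datasets. As a rough illustration of how such a bias could be measured, the sketch below counts how often hypotheses contain a negation cue, broken down by label. It is not the authors' method: the cue list, the function names `has_negation` and `negation_rate_by_label`, the label names, and the toy sentence pairs are all assumptions made for this example. Surface matching is also crude, since it ignores scope and ambiguous words.

```python
import re
from collections import Counter

# Assumed, non-exhaustive list of Portuguese negation cues (not taken from the paper).
NEGATION_CUES = {"não", "nunca", "jamais", "nem", "nenhum", "nenhuma", "ninguém", "sem"}


def has_negation(sentence: str) -> bool:
    """Return True if any assumed negation cue appears as a whole token."""
    tokens = re.findall(r"\w+", sentence.lower())
    return any(tok in NEGATION_CUES for tok in tokens)


def negation_rate_by_label(pairs):
    """Map each label to the fraction of its hypotheses that contain a cue."""
    totals, negated = Counter(), Counter()
    for _premise, hypothesis, label in pairs:
        totals[label] += 1
        if has_negation(hypothesis):
            negated[label] += 1
    return {label: negated[label] / totals[label] for label in totals}


# Toy pairs invented for illustration; they do not come from InferBR, ASSIN or ASSIN2.
toy_pairs = [
    ("Um homem está tocando violão.", "Um homem não está tocando violão.", "contradiction"),
    ("Uma mulher corre na praia.", "Ninguém corre na praia.", "contradiction"),
    ("Um cachorro pula no lago.", "Um animal pula na água.", "entailment"),
    ("Crianças brincam no parque.", "Crianças estão ao ar livre.", "entailment"),
]

rates = negation_rate_by_label(toy_pairs)
print(rates)
```

If negation cues appear almost only under one label, as in this toy sample, a classifier can score well by keying on the cue alone. That kind of skew is what augmenting with negated hypotheses under other labels would be expected to reduce.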