Data Augmentation for Named Entity Recognition in Domain-Specific Scenarios in Portuguese
Higor Moreira, Patricia Ferreira da Silva, Luciana Bencke, Viviane Moreira
Abstract
Named Entity Recognition (NER) is an important task of Natural Language Processing. Achieving good results in this task usually requires a large amount of labeled data to train models. This is especially difficult in domain-specific datasets and low-resourced languages. To mitigate the high cost of human-annotated data, data augmentation can be used. In this work, we evaluate Data Augmentation techniques for NER, focusing on domain-specific datasets in Portuguese. We employed augmentation techniques based on rules, back-translation, and large language models on four datasets of varying sizes to train Transformer-based NER models. The results showed that most techniques improved over the baseline, with the best results achieved using PP-LLM, SR, and MR.
- Anthology ID:
- 2026.propor-1.25
- Volume:
- Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
- Month:
- April
- Year:
- 2026
- Address:
- Salvador, Brazil
- Editors:
- Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
- Venue:
- PROPOR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 250–259
- URL:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.25/
- Cite (ACL):
- Higor Moreira, Patricia Ferreira da Silva, Luciana Bencke, and Viviane Moreira. 2026. Data Augmentation for Named Entity Recognition in Domain-Specific Scenarios in Portuguese. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 250–259, Salvador, Brazil. Association for Computational Linguistics.
- Cite (Informal):
- Data Augmentation for Named Entity Recognition in Domain-Specific Scenarios in Portuguese (Moreira et al., PROPOR 2026)
- PDF:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.25.pdf