Biatron: A Parameter-Efficient Small Language Model for Brazilian Portuguese with Integrated Mathematical Reasoning
Daniel Fazzioni, Maria C. X. de Almeida, Anna P. V. L. B. Moreira, Anderson S. Soares, Sávio S. T. de Oliveira, Fernando M. Federson
Abstract
The development of Small Language Models (SLMs) for Portuguese faces significant challenges in balancing parameter efficiency with specialized capabilities, particularly in mathematical reasoning, where existing models demonstrate limited native competence. This work introduces the first model in the Biatron series, a 345-million-parameter language model optimized for Brazilian Portuguese through strategic data curation rather than brute-force parameter scaling. Using a carefully designed 60-30-10 data mixture combining high-quality Portuguese text from GigaVerbo, chain-of-thought reasoning examples, and mathematical datasets, Biatron was trained on 300 billion tokens using the Megatron-LM framework, achieving 32% Model FLOP Utilization. The model attains an overall score of 0.245 (aggregate performance) on Portuguese-specific benchmarks, coming within 1.6% of Tucano-630M’s performance while using 45% fewer parameters. Most significantly, Biatron achieves 7.5% Pass@1 accuracy on mathematical reasoning tasks, more than doubling the performance of Tucano-2.4B (3.5%) despite being nearly seven times smaller. These results validate that strategic data mixing can rival parameter scaling for language model development, establishing a reproducible methodology for efficient AI development in resource-constrained language contexts. To support reproducibility and further research, the final model weights, training logs, and intermediate checkpoints are publicly available.
- Anthology ID:
- 2026.propor-1.86
- Volume:
- Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
- Month:
- April
- Year:
- 2026
- Address:
- Salvador, Brazil
- Editors:
- Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
- Venue:
- PROPOR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 868–877
- URL:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.86/
- Cite (ACL):
- Daniel Fazzioni, Maria C. X. de Almeida, Anna P. V. L. B. Moreira, Anderson S. Soares, Sávio S. T. de Oliveira, and Fernando M. Federson. 2026. Biatron: A Parameter-Efficient Small Language Model for Brazilian Portuguese with Integrated Mathematical Reasoning. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 868–877, Salvador, Brazil. Association for Computational Linguistics.
- Cite (Informal):
- Biatron: A Parameter-Efficient Small Language Model for Brazilian Portuguese with Integrated Mathematical Reasoning (Fazzioni et al., PROPOR 2026)
- PDF:
- https://preview.aclanthology.org/ingest-dnd/2026.propor-1.86.pdf
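The 60-30-10 data mixture described in the abstract amounts to weighted sampling over three source pools. The sketch below is an illustrative assumption about how such a mixture could be realized, not the paper's actual training pipeline; the source names are hypothetical placeholders (GigaVerbo is the only source named in the abstract).

```python
import random

# Hypothetical sketch of the 60-30-10 mixture from the abstract:
# 60% Portuguese text (e.g. GigaVerbo), 30% chain-of-thought
# examples, 10% mathematical data. Names here are illustrative.
MIXTURE = {
    "portuguese_text": 0.60,
    "chain_of_thought": 0.30,
    "mathematics": 0.10,
}

def sample_sources(n: int, seed: int = 0) -> dict:
    """Draw n source labels according to the mixture weights."""
    rng = random.Random(seed)
    picks = rng.choices(list(MIXTURE), weights=list(MIXTURE.values()), k=n)
    return {name: picks.count(name) for name in MIXTURE}

counts = sample_sources(100_000)
print(counts)  # roughly 60_000 / 30_000 / 10_000
```

In a real training run the sampled label would select which dataset the next document batch is drawn from, so the token-level proportions converge to the target mixture over the 300B-token budget.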