P3B3: A Multi-Turn Conversational Benchmark for Measuring European and Brazilian Portuguese Variety Bias in LLMs
Rafael Ferreira, Inês Vieira, Inês Calvo, James Furtado, Iago Paulo, Diogo Glória-Silva, Diogo Tavares, David Semedo, Joao Magalhaes
Abstract
As Large Language Models (LLMs) become embedded in everyday communication, capturing regional linguistic variation is essential for reliable and equitable language use. In Portuguese, European (pt-PT) and Brazilian (pt-BR) varieties remain unevenly represented, with pt-BR dominating in data quantity, while LLM preference for Portuguese variants remains underexplored. To address this gap, we introduce P3B3, an expert-curated variety-agnostic benchmark of conversational prompts, along with an evaluation framework for measuring variety bias and controllability. Experiments on several models show that most LLMs exhibit a strong bias toward pt-BR, with variation in controllability across models. These results highlight the need for more balanced multilingual representation across language varieties.
- Anthology ID:
- 2026.mellm-1.23
- Volume:
- Proceedings of the 1st Workshop on Multilinguality in the Era of Large Language Models (MeLLM 2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, United States
- Editors:
- Kaiyu Huang, Fengran Mo, Pinzhen Chen, Meng Jiang
- Venues:
- MeLLM | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 240–248
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.mellm-1.23/
- Cite (ACL):
- Rafael Ferreira, Inês Vieira, Inês Calvo, James Furtado, Iago Paulo, Diogo Glória-Silva, Diogo Tavares, David Semedo, and Joao Magalhaes. 2026. P3B3: A Multi-Turn Conversational Benchmark for Measuring European and Brazilian Portuguese Variety Bias in LLMs. In Proceedings of the 1st Workshop on Multilinguality in the Era of Large Language Models (MeLLM 2026), pages 240–248, San Diego, United States. Association for Computational Linguistics.
- Cite (Informal):
- P3B3: A Multi-Turn Conversational Benchmark for Measuring European and Brazilian Portuguese Variety Bias in LLMs (Ferreira et al., MeLLM 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.mellm-1.23.pdf