Rafael Oleques Nunes

Also published as: Rafael Oleques Nunes

2026

Uso de técnicas de Aprendizado de Máquina e Modelos de Língua de Larga Escala para avaliação automática de textos do exame Celpe-Bras
Rafael Oleques Nunes | Bernardo Cobalchini Zietolie | Ricardo Zanini De Costa | Rodrigo Brock da Silva | João Victor Piardi Pacheco | Rafaela Dall'Agnol da Rocha | Dennis Giovani Balreira | Elisa Marchioro Stumpf | Juliana Roquele Schoffen
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

Celpe-Bras is Brazil's official proficiency exam in Portuguese as an Additional Language (Inep, 2020). The written part of the exam requires participants to produce four texts in response to tasks based on video, audio, and input texts, so exam preparation has to be built on text (re)writing practice. On the one hand, teachers who prepare students for the exam have a high volume of texts to correct; on the other, students have few accessible teaching resources aligned with the theoretical construct of Celpe-Bras. In this context, and driven by recent advances in Natural Language Processing (NLP), large language models (LLMs), and Artificial Intelligence, this study aims to map and compare methods for the automatic assessment of texts produced in the Celpe-Bras exam. Several models are presented and tested, covering both traditional machine learning algorithms and pretrained language models such as BERT, BART, and T5. The best results were obtained by adaptations of the BERT model, which were slightly better than those of the remaining models but came at a considerably higher computational cost.

Evaluating Small Language Models for English-to-Portuguese Translation: Impact of Model Scale and Quantization
Gustavo Lopes Tamiosso | Rafael Oleques Nunes | Dennis Giovani Balreira
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

Small language models (SLMs) are increasingly adopted for machine translation due to their lower computational and deployment costs, yet focused and systematic evaluation for English-to-Portuguese remains limited. We benchmarked dozens of SLMs (135M–20B parameters) across multiple architectures and quantization schemes (FP16, Q8_0, Q4_K_M) on two datasets: FLORES-101 (Portuguese subset, 1,012 sentences) and the multidomain OPUS-100 dataset (~10k sentences). We computed lexical and semantic metrics (BLEU, chrF, and BERTScore) and assessed statistical differences using non-parametric Friedman tests over paired sentence-level scores, followed by Wilcoxon signed-rank post-hoc comparisons with Holm correction. Normality assumptions were evaluated using the Shapiro–Wilk test. Our results strongly suggest that 8-bit quantization (Q8_0) preserves semantic quality with negligible average loss. With 4-bit quantization (Q4_K_M), differences reach statistical significance in roughly half of the model configurations, but paired effect sizes (Cliff's δ) remain negligible to small in magnitude, with measurable degradation concentrated in lower-capacity models. Model scale exhibits only a weak correlation with translation quality: medium-sized models can match or outperform larger ones depending on model family and pretraining. These findings highlight trade-offs between efficiency and quality and inform the design of practical English-to-Portuguese translation pipelines based on SLMs.
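
The two statistics named in this abstract, Holm correction and Cliff's δ, can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names `cliffs_delta` and `holm_adjust` are hypothetical, and the δ shown is the standard all-pairs formulation, which may differ from the paired variant the abstract mentions.

```python
from itertools import product


def cliffs_delta(xs, ys):
    """Standard Cliff's delta: P(x > y) - P(x < y) over all (x, y) pairs."""
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Visit p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

A δ near 0 means the two score distributions overlap almost completely, which is how a comparison can be statistically significant on a large sample while the effect itself stays negligible.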

Challenges in Image-Caption Association in Portuguese: Evaluating the CLIP Model on the FM30K Dataset
Vitória Colonetti Benedet | Gustavo Lopes Tamiosso | Rafael Oleques Nunes | Dennis Giovani Balreira
Proceedings of the Fifteenth Language Resources and Evaluation Conference

In recent years, multimodal models such as CLIP have achieved significant advances in associating images and texts. However, most of these advances stem from models trained almost exclusively on English, which limits their effectiveness in other languages. This challenge is particularly relevant for Brazilian Portuguese, a language that still lacks dedicated multimodal resources and relies predominantly on automatic translations. This work investigates the performance of CLIP-based multimodal models in the task of associating images and descriptions written in Brazilian Portuguese. The analysis begins with a zero-shot scenario, in which different CLIP variants are directly evaluated on the FM30K dataset, composed of images and captions originally written in Portuguese. An additional experiment with automatic translations is also conducted to examine the impact of language on cross-modal retrieval tasks. Subsequently, fine-tuning is performed on the textual encoder of the ViT-B/32 model, keeping the visual encoder frozen, with the goal of adapting the model to the target language. The results show that models originally trained in English perform worse in Portuguese, while linguistically adapted variants, either multilingual or Portuguese-specific, achieve superior performance. The proposed fine-tuning approach reduced this performance gap, leading to notable improvements. In the image-to-text scenario, the model achieved an absolute increase of 27.65 percentage points in the Accuracy@1 metric, representing a 209% relative gain over the original CLIP ViT-B/32. In the text-to-image scenario, the gain was 15.47 percentage points, amounting to an even higher 385% relative improvement, contributing to a more balanced association between images and captions.
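
The relationship between the absolute and relative gains quoted in this abstract can be checked with a short sketch. The baseline and fine-tuned Accuracy@1 values it produces are back-calculated from the two reported numbers, not taken from the paper, and are only approximate because the reported percentages are rounded. The helper name `implied_scores` is hypothetical.

```python
def implied_scores(abs_gain_pp, rel_gain_pct):
    """Back out (baseline, fine-tuned) scores from an absolute gain in
    percentage points and a relative gain in percent.

    relative gain = absolute gain / baseline, so
    baseline = absolute gain / (relative gain / 100).
    """
    baseline = abs_gain_pp / (rel_gain_pct / 100.0)
    return baseline, baseline + abs_gain_pp


# Figures quoted in the abstract; the outputs are estimates, not reported results.
image_to_text = implied_scores(27.65, 209)
text_to_image = implied_scores(15.47, 385)

print(f"image-to-text: baseline ~{image_to_text[0]:.2f}, fine-tuned ~{image_to_text[1]:.2f}")
print(f"text-to-image: baseline ~{text_to_image[0]:.2f}, fine-tuned ~{text_to_image[1]:.2f}")
```

The smaller absolute gain in the text-to-image direction yields the larger relative gain because the implied starting point is much lower.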

From Bones to Rocks: A Systematic Evaluation of Specialized Definition Generation for Portuguese
Rafael Oleques Nunes | Dennis Giovani Balreira | Joel Luís Carbonera
Proceedings of the Fifteenth Language Resources and Evaluation Conference

This work presents a systematic evaluation of Large Language Models (LLMs) for generating specialized definitions in Portuguese, focusing on the medical and geological domains. We introduce a robust benchmark and employ a rigorous, statistically grounded evaluation framework, including 5-fold cross-validation and significance testing, to ensure the reliability and generalizability of our findings. Our comprehensive experiments with various open-source, decoder-only LLMs explore in-context learning (ICL) with diverse prompting strategies, ranging from zero-shot to few-shot and contextual information. The evaluated models include multilingual architectures and one model that underwent continued pretraining specifically for Portuguese, allowing us to assess the impact of language adaptation on definition generation quality. The results indicate that most evaluated models perform effectively in this task, with relatively small performance differences among the top models. Statistical analyses confirmed that these differences are not consistently significant, suggesting that several open LLMs, regardless of their size, multilingual capacity, or language specialization, offer comparable effectiveness for Portuguese definition generation. These findings provide valuable insights for selecting and adapting models for specialized NLP tasks in low-resource languages like Portuguese.

Geological Text Summarization Using Generative Large Language Models
Matheus Stein de Aguiar | Rafael Oleques Nunes | Dennis Giovani Balreira
Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1

Large generative language models have demonstrated impressive performance in various Natural Language Processing (NLP) tasks. However, the geological domain presents unique challenges for NLP due to its specialized language, which is full of technical terms. Language models pre-trained on generic corpora may therefore be unsuitable for tasks specific to the geological domain. This article compares several models to identify those with the best performance in the Portuguese geological domain for a text summarization task. We applied the models to a dataset from Revista Geologia USP. The dataset consists of abstracts of scientific texts and their respective titles; the summarization task asks the models to approximate each title from its abstract. We tested the models in various scenarios, with and without examples, and at two temperature levels. We then evaluated the models' performance using quantitative metrics and a brief qualitative analysis comparing the titles proposed by the models with the original titles. The results show that the Gemma3:27b model was better in some scenarios, while the Llama3:8b model performed best in others.

With the growing availability of large text collections, efficient tools for corpus annotation and normalization have become increasingly important in linguistic and computational research. This paper presents CorSpell, a semiautomatic tool developed to support the spelling normalization of Brazilian Portuguese texts within the CorCel project, a corpus comprising over 15,000 handwritten exam responses from the Celpe-Bras proficiency test. Given the corpus scale, manual normalization is impractical; CorSpell streamlines this process by enabling users to visualize, select, and replace tokens directly through an intuitive web interface. The tool combines automatic suggestions from PT-BR dictionaries with human validation. CorSpell significantly reduces annotation time, minimizes errors, and facilitates collaborative work, providing a practical and scalable solution for corpus normalization and a foundation for LLM-based modeling of Portuguese proficiency.

2024

A Named Entity Recognition Approach for Portuguese Legislative Texts Using Self-Learning
Rafael Oleques Nunes | Dennis Giovani Balreira | André Suslik Spritzer | Carla Maria Dal Sasso Freitas
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 1