Rafael Fernandes


2026

We describe the USP system for the AmericasNLP 2026 Shared Task on Culturally Relevant Image Captioning for Indigenous Languages, covering Guaraní (grn), Maya Yucateco (yua), Nahuatl (nah), Wixárika (hch), and Bribri (bzd). We propose a two-stage cascade: Qwen3-VL-8B-Instruct (Bai et al., 2025) first generates Spanish captions from language-specific cultural prompts, and a separate NLLB-200-distilled-600M model (NLLB Team et al., 2022), fine-tuned for each language, then translates them into the target language. We train on AmericasNLP 2023 data (Ebrahimi et al., 2023) augmented with public parallel corpora. Our system achieves competitive results, including 3rd place in Guaraní human evaluation (2.41/5.0) and 5th in Bribri (1.09/5.0) among 8 teams. We also report that NLLB-200-distilled-600M lacks vocabulary entries for Bribri and Maya Yucateco and, rather than raising an error, silently produces English output.
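The cascade and the silent-failure finding can be illustrated with a minimal sketch. This is not the authors' code: the function names, the stub captioner and translator, and the language tags `bzd_Latn` and `yua_Latn` are hypothetical, and the vocabulary is passed in as a plain set rather than read from a real tokenizer. The point is only that an explicit check on the target tag turns a silent fallback into a visible error before any translation is attempted.

```python
from typing import Callable, Dict, Iterable


class UnsupportedLanguageError(ValueError):
    """Raised when the target language tag is absent from the vocabulary."""


def check_language_tag(tag: str, vocabulary: Iterable[str]) -> None:
    # Fail loudly instead of letting the translator fall back to another language.
    if tag not in set(vocabulary):
        raise UnsupportedLanguageError(f"language tag {tag!r} not in vocabulary")


def caption_cascade(
    image: object,
    target_tag: str,
    captioner: Callable[[object, str], str],
    translators: Dict[str, Callable[[str], str]],
    vocabulary: Iterable[str],
) -> str:
    """Stage 1: Spanish caption from the image. Stage 2: translate to the target."""
    check_language_tag(target_tag, vocabulary)
    spanish_caption = captioner(image, target_tag)   # language-specific prompt
    return translators[target_tag](spanish_caption)  # language-specific translator


# Stubs standing in for the vision-language model and the fine-tuned translators.
def stub_captioner(image: object, target_tag: str) -> str:
    return "Una mujer teje una hamaca."


stub_translators = {"grn_Latn": lambda text: f"[grn] {text}"}

# Hypothetical vocabulary: the Bribri and Maya Yucateco tags are deliberately absent.
stub_vocabulary = {"spa_Latn", "eng_Latn", "grn_Latn"}

if __name__ == "__main__":
    print(caption_cascade("photo.jpg", "grn_Latn", stub_captioner,
                          stub_translators, stub_vocabulary))
    try:
        caption_cascade("photo.jpg", "bzd_Latn", stub_captioner,
                        stub_translators, stub_vocabulary)
    except UnsupportedLanguageError as err:
        print("rejected:", err)
```

With a real tokenizer, the set passed as `vocabulary` would be built from whatever language tags that tokenizer actually exposes; how the authors handled the missing entries is not stated here.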

2024

This research investigates the challenges of translating spatial language using open-source LLMs versus traditional NMT systems. Focusing on spatial prepositions such as ACROSS, INTO, ONTO, and THROUGH, which are particularly challenging for the EN-PT-BR pair, the study evaluates translations using the BLEU, METEOR, BERTScore, COMET, and TER metrics, along with manual error analysis. The findings reveal that moderate-sized LLMs, such as LLaMa-3-8B and Mixtral-8x7B, achieve accuracy comparable to NMT systems such as DeepL. However, LLMs frequently exhibit mistranslation errors, including interlanguage/code-switching and anglicisms, while NMT systems demonstrate better fluency. Both LLMs and NMT systems struggle with spatial-related errors, including syntactic projections and polysemy. The study concludes that significant hurdles remain in accurately translating spatial language, suggesting that future research should focus on enhancing training datasets, refining models, and developing more sophisticated evaluation metrics.
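One of the error types named above, an English preposition left untranslated in the Portuguese output, lends itself to a simple automatic flag. The sketch below is an illustration under stated assumptions, not the study's annotation procedure: the function name is hypothetical, and it only matches the four prepositions listed in the abstract as whole lowercase tokens, so it cannot detect polysemy or syntactic-projection errors.

```python
import re
from typing import List

# The four English spatial prepositions named in the abstract.
SPATIAL_PREPOSITIONS = ("across", "into", "onto", "through")


def untranslated_prepositions(hypothesis: str) -> List[str]:
    """Return English spatial prepositions that survive in a translated sentence."""
    tokens = re.findall(r"\w+", hypothesis.lower())
    return [token for token in tokens if token in SPATIAL_PREPOSITIONS]


if __name__ == "__main__":
    # Invented example sentences, used only to exercise the function.
    print(untranslated_prepositions("Ela correu through o parque."))
    print(untranslated_prepositions("Ela correu pelo parque."))
```

A flag like this would at most pre-select candidates for manual error analysis; it says nothing about whether a preposition that was translated was translated correctly.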