VerbaNexAI at SemEval-2025 Task 2: Enhancing Entity-Aware Translation with Wikidata-Enriched MarianMT

Daniel Peña Gnecco, Juan Carlos Martinez Santos, Edwin Puertas


Abstract
This paper presents the VerbaNexAi Lab system for SemEval-2025 Task 2: Entity-Aware Machine Translation (EA-MT), focusing on translating named entities from English to Spanish across categories such as musical works, foods, and landmarks. Our approach integrates detailed data preprocessing, enrichment with 240,432 Wikidata entity pairs, and fine-tuning of the MarianMT model to enhance entity translation accuracy. Official results reveal a COMET score of 87.09, indicating high fluency, an M-ETA score of 24.62, highlighting challenges in entity precision, and an Overall Score of 38.38, ranking last among 34 systems. While Wikidata improved translations for common entities like “Águila de San Juan,” our static methodology underperformed compared to dynamic LLM-based approaches.
Anthology ID:
2025.semeval-1.167
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1255–1262
Language:
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.167/
DOI:
Bibkey:
Cite (ACL):
Daniel Peña Gnecco, Juan Carlos Martinez Santos, and Edwin Puertas. 2025. VerbaNexAI at SemEval-2025 Task 2: Enhancing Entity-Aware Translation with Wikidata-Enriched MarianMT. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1255–1262, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
VerbaNexAI at SemEval-2025 Task 2: Enhancing Entity-Aware Translation with Wikidata-Enriched MarianMT (Peña Gnecco et al., SemEval 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.167.pdf