Juan Pablo Martínez
Also published as: Juan Pablo Martínez Cortés
2024
Expanding the FLORES+ Multilingual Benchmark with Translations for Aragonese, Aranese, Asturian, and Valencian
Juan Antonio Perez-Ortiz | Felipe Sánchez-Martínez | Víctor M. Sánchez-Cartagena | Miquel Esplà-Gomis | Aaron Galiano Jimenez | Antoni Oliver | Claudi Aventín-Boya | Alejandro Pardos | Cristina Valdés | Jusèp Loís Sans Socasau | Juan Pablo Martínez
Proceedings of the Ninth Conference on Machine Translation
In this paper, we describe the process of creating the FLORES+ datasets for several Romance languages spoken in Spain, namely Aragonese, Aranese, Asturian, and Valencian. The Aragonese and Aranese datasets are entirely new additions to the FLORES+ multilingual benchmark. An initial version of the Asturian dataset was already available in FLORES+, and our work focused on a thorough revision. Similarly, FLORES+ included a Catalan dataset, which we adapted to the Valencian variety spoken in the Valencian Community. The development of the Aragonese, Aranese, and revised Asturian FLORES+ datasets was undertaken as part of a WMT24 shared task on translation into low-resource languages of Spain.
2012
Free/Open Source Shallow-Transfer Based Machine Translation for Spanish and Aragonese
Juan Pablo Martínez Cortés | Jim O’Regan | Francis Tyers
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This article describes the development of a bidirectional shallow-transfer based machine translation system for Spanish and Aragonese, based on the Apertium platform, reusing the resources provided by other translators built for the platform. The system, and the morphological analyser built for it, are both the first resources of their kind for Aragonese. The morphological analyser has coverage of over 80%, and is being reused to create a spelling checker for Aragonese. The translator is bidirectional: the Word Error Rate for Spanish to Aragonese is 16.83%, while Aragonese to Spanish is 11.61%.
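For readers unfamiliar with the metric, Word Error Rate is commonly computed as the word-level edit distance between a system output and a reference translation, divided by the number of words in the reference. The sketch below illustrates that common definition only; the abstract does not say which tool or tokenization the authors used, so the function name `word_error_rate`, the whitespace tokenization, and the example sentences are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length.

    Assumption: tokens are obtained by splitting on whitespace; the
    evaluation in the paper may have tokenized differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (minus the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # reference word missing from hypothesis
                curr[j - 1] + 1,                  # extra word in hypothesis
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative input only (not data from the paper): one substituted
# word out of four reference words.
print(word_error_rate("a b c d", "a x c d"))
```

Under this definition, a reported figure of 16.83% would correspond to roughly 17 word edits per 100 reference words.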