Jose Cols

2025

SUWMIT at BioLaySumm2025: Instruction-based Summarization with Contrastive Decoding
Priyam Basu | Jose Cols | Daniel Jarvis | Yongsin Park | Daniel Rodabaugh
Proceedings of the 24th Workshop on Biomedical Language Processing (Shared Tasks)

2024

Spanish Corpus and Provenance with Computer-Aided Translation for the WMT24 OLDI Shared Task
Jose Cols
Proceedings of the Ninth Conference on Machine Translation

This paper presents the Seed-CAT submission to the WMT24 Open Language Data Initiative shared task. We detail our data collection method, which involves a computer-aided translation tool developed explicitly for translating Seed corpora. We release a professionally translated Spanish corpus and a provenance dataset documenting the translation process. The quality of the data was validated on the FLORES+ benchmark with English-Spanish neural machine translation models, achieving an average chrF++ score of 34.9.

Co-authors

Priyam Basu (1)
Daniel Jarvis (1)
Yongsin Park (1)
Daniel Rodabaugh (1)

Venues

ws (2)
bionlp (1)
wmt (1)