Anish Kumar
Machine translation (MT) with Large Language Models (LLMs) holds promise as a clinical translation tool with more capabilities than a traditional MT model. This work compares the quality of English-to-Spanish translation by three LLMs: ChatGPT-3.5 Turbo, ChatGPT-4o, and Aguila, against Google Translate. The test set used in this study is MedlinePlus, a parallel dataset of educational health information in English and Spanish developed by the National Library of Medicine. ChatGPT-4o and Google Translate performed similarly in both automated scoring (BLEU, METEOR, and BERTScore) and human evaluation, with ChatGPT-3.5 Turbo not far behind. Aguila, the only LLM intended primarily for Spanish and Catalan use, surprisingly performed much worse than the other models. However, qualitative analysis of Aguila's output revealed Spanish word choices that may reach a broader audience.
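For readers unfamiliar with the automated metrics mentioned above, the following is a minimal, self-contained sketch of sentence-level BLEU (clipped n-gram precision with a brevity penalty). It is illustrative only, with add-one smoothing chosen for simplicity; it is not the evaluation pipeline used in the paper, which would typically rely on a standard implementation such as sacreBLEU.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, with counts."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped
    n-gram precisions (add-one smoothed) times a brevity penalty."""
    hyp, ref = hypothesis.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        # clip each hypothesis n-gram count by its count in the reference
        clipped = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        # add-one smoothing so one empty n-gram order does not zero the score
        log_prec += math.log((clipped + 1) / (total + 1))
    # penalize hypotheses shorter than the reference
    brevity = min(1.0, math.exp(1 - len(ref) / max(len(hyp), 1)))
    return brevity * math.exp(log_prec / max_n)
```

A perfect match scores 1.0, and any divergence in word choice or length lowers the score, which is why lexically valid but non-literal translations (like some of Aguila's) can be penalized by surface metrics even when human evaluators accept them.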
We introduce a purely monolingual approach to filtering for parallel data from a noisy corpus in a low-resource scenario. Our work is inspired by Junczys-Dowmunt (2018), but we relax the requirements to allow for cases where no parallel data is available. Our primary contribution is a dual monolingual cross-entropy delta criterion modified from Cynical data selection (Axelrod, 2017); it is competitive (within 1.8 BLEU) with the best bilingual filtering method when used to train SMT systems. Our approach is featherweight, and runs end-to-end on a standard laptop in three hours.
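The core idea of a cross-entropy delta criterion can be sketched in a few lines: score each side of a sentence pair by the difference between its cross-entropy under an in-domain language model and under a general-domain one, then sum the two sides. The sketch below uses toy add-one-smoothed unigram LMs purely for illustration; the paper's actual models, smoothing, and selection schedule (adapted from Cynical selection) differ.

```python
import math
from collections import Counter

class UnigramLM:
    """Tiny add-one-smoothed unigram LM; a stand-in for whatever
    language model the real pipeline would train."""
    def __init__(self, corpus):
        tokens = [t for line in corpus for t in line.split()]
        self.counts = Counter(tokens)
        self.total = len(tokens)
        self.vocab = len(self.counts) + 1  # +1 reserves mass for unseen tokens

    def cross_entropy(self, sentence):
        """Per-token cross-entropy (bits) of a sentence under this LM."""
        tokens = sentence.split()
        return -sum(
            math.log2((self.counts[t] + 1) / (self.total + self.vocab))
            for t in tokens
        ) / max(len(tokens), 1)

def dual_delta(src, tgt, lm_in_src, lm_gen_src, lm_in_tgt, lm_gen_tgt):
    """Dual monolingual cross-entropy delta for a sentence pair:
    (in-domain minus general) cross-entropy, summed over both sides.
    Lower scores indicate pairs worth keeping."""
    return (
        (lm_in_src.cross_entropy(src) - lm_gen_src.cross_entropy(src))
        + (lm_in_tgt.cross_entropy(tgt) - lm_gen_tgt.cross_entropy(tgt))
    )
```

Because both deltas are computed from monolingual models only, no parallel seed data is required: ranking the noisy corpus by this score and keeping the lowest-scoring pairs yields the filtered training set.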