Mark Andrade
2025
eSTÓR: Curating Irish Datasets for Machine Translation
Abigail Walsh | Órla Ní Loinsigh | Jane Adkins | Ornait O’Connell | Mark Andrade | Teresa Clifford | Federico Gaspari | Jane Dunne | Brian Davis
Proceedings of Machine Translation Summit XX: Volume 2
Minority languages such as Irish are massively under-resourced, particularly in terms of high-quality domain-relevant data, limiting the capabilities of machine translation (MT) engines, even those integrating large language models (LLMs). The eSTÓR project, described in this paper, focuses on the collection and curation of high-quality Irish text data for diverse domains.