Sergi Alvarez Vidal

Also published as: Sergi Alvarez-Vidal


2024

Training an NMT system for legal texts of a low-resource language variety South Tyrolean German - Italian
Antoni Oliver | Sergi Alvarez-Vidal | Egon Stemle | Elena Chiocchetti
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)

This paper illustrates the process of training and evaluating NMT systems for a language pair that includes a low-resource language variety. A parallel corpus of legal texts for Italian and South Tyrolean German has been compiled, with South Tyrolean German being the low-resourced language variety. As the size of the compiled corpus is insufficient for training, we have combined the corpus with several parallel corpora using data weighting at sentence level. We then performed an evaluation of each combination and of two popular commercial systems.

LitPC: A set of tools for building parallel corpora from literary works
Antoni Oliver | Sergi Alvarez-Vidal
Proceedings of the 1st Workshop on Creative-text Translation and Technology

In this paper, we describe the LitPC toolkit, a variety of tools and methods designed for the quick and effective creation of parallel corpora derived from literary works. This toolkit can be a useful resource due to the scarcity of curated parallel texts for this domain. We also feature a case study describing the creation of a Russian-English parallel corpus based on the literary works by Leo Tolstoy. Furthermore, an augmented version of this corpus is used to both train and assess neural machine translation systems specifically adapted to the author’s style.

2023

Proceedings of the 24th Annual Conference of the European Association for Machine Translation
Mary Nurminen | Judith Brenner | Maarit Koponen | Sirkku Latomaa | Mikhail Mikhailov | Frederike Schierl | Tharindu Ranasinghe | Eva Vanmassenhove | Sergi Alvarez Vidal | Nora Aranberri | Mara Nunziatini | Carla Parra Escartín | Mikel Forcada | Maja Popovic | Carolina Scarton | Helena Moniz
Proceedings of the 24th Annual Conference of the European Association for Machine Translation