Federico Garcea


2025

Increasing attention is being dedicated by the NLP community to gender-fair practices, including emerging forms of non-binary language. Given the shift to the prompting paradigm for many tasks, direct comparisons between prompted and fine-tuned models in this context are still lacking. We aim to fill this gap by comparing prompt engineering and fine-tuning techniques for gender-fair rewriting in Italian. We do so by framing a rewriting task in which Italian gender-marked translations of English gender-ambiguous sentences are adapted into gender-neutral alternatives using direct non-binary language. We augment existing datasets with gender-neutral translations and conduct experiments to determine the best architecture and approach for this task, fine-tuning and prompting both seq2seq encoder-decoder and autoregressive decoder-only models. We show that smaller seq2seq models can reach good performance when fine-tuned, even with relatively little data; when it comes to prompting, including task demonstrations is crucial, and chat-tuned models reach the best results in a few-shot setting. Overall, we achieve promising results, especially in contexts of limited data and resources.
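
A minimal sketch of the few-shot prompting setup described above, assuming a generic chat-tuned model served through the Hugging Face transformers text-generation pipeline; the model name and the demonstration pairs are illustrative placeholders, not the ones used in this work.

```python
# Sketch: few-shot prompting a chat-tuned model to rewrite a gender-marked
# Italian sentence into a gender-neutral alternative.
# Model name and demonstrations are illustrative placeholders.
from transformers import pipeline

# Hypothetical (gendered, neutral) demonstration pairs.
DEMONSTRATIONS = [
    ("Benvenuto, caro cliente.", "Ti diamo il benvenuto."),
    ("I professori sono arrivati.", "Il personale docente è arrivato."),
]

def build_messages(gendered_sentence: str) -> list[dict]:
    """Build a chat-style few-shot prompt for gender-neutral rewriting."""
    messages = [{
        "role": "system",
        "content": "Riscrivi la frase in italiano usando un linguaggio neutro rispetto al genere.",
    }]
    for gendered, neutral in DEMONSTRATIONS:
        messages.append({"role": "user", "content": gendered})
        messages.append({"role": "assistant", "content": neutral})
    messages.append({"role": "user", "content": gendered_sentence})
    return messages

# "some-chat-tuned-model" stands in for whatever chat-tuned checkpoint is used.
generator = pipeline("text-generation", model="some-chat-tuned-model")
result = generator(build_messages("Siamo orgogliosi dei nostri dipendenti."), max_new_tokens=64)
print(result[0]["generated_text"][-1]["content"])  # the generated assistant turn
```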

2023

We approach the task of assessing the suitability of a source text for translation by transferring knowledge from established MT evaluation metrics to a model able to predict MT quality a priori, from the source text alone. To enable experiments in this direction, we start from reference English-German parallel corpora to build a corpus of 14,253 tuples pairing each source text with quality scores from four state-of-the-art metrics: cushLEPOR, BERTScore, COMET, and TransQuest. With this new resource at hand, we fine-tune XLM-RoBERTa, both in a single-task and a multi-task setting, to predict these evaluation scores from the source text alone. Results for this methodology are promising: without looking at the actual machine translations, the single-task model approximates well-established MT evaluation and quality estimation metrics, achieving low RMSE values in the [0.1-0.2] range and Pearson correlation scores of up to 0.688.
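
A minimal sketch of the single-task setting described above, assuming the Hugging Face transformers Trainer API; the data files, column names, and hyperparameters are illustrative assumptions rather than the actual experimental configuration.

```python
# Sketch: fine-tune XLM-RoBERTa as a single-task regressor that predicts one
# MT quality score (e.g. COMET) from the English source text alone.
# File names, column names and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=1, problem_type="regression")

# Assumed CSV layout: one column with the source sentence, one with the score.
data = load_dataset("csv", data_files={"train": "train.csv", "validation": "dev.csv"})

def preprocess(batch):
    encoded = tokenizer(batch["source_text"], truncation=True, max_length=256)
    encoded["labels"] = [float(score) for score in batch["comet_score"]]
    return encoded

data = data.map(preprocess, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="source-only-quality-regressor",
                           num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
print(trainer.evaluate())  # reports the MSE-based eval loss on the validation set
```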
In the domain of cuisine, both dishes and ingredients tend to be heavily rooted in the local context they belong to. As a result, the associated terms are often realia tied to specific cultures and languages. This causes difficulties for non-speakers of the local language and machine translation (MT) systems alike, as it implies a lack of the concept and/or of a plausible translation in the target language. MT typically opts for one of two alternatives: keeping the source-language terms untranslated or relying on a hyperonym or near-synonym in the target language, provided one exists. !Translate proposes a better alternative: explaining. Given a cuisine entry such as a restaurant menu item, we identify culture-specific terms and enrich the output of the MT system with automatically retrieved definitions of the non-translatable terms in the target language, making the translation more actionable for the end user.
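
A minimal sketch of the enrichment step described above, assuming the culture-specific terms have already been identified; translate() and GLOSSARY stand in for the actual MT system and definition-retrieval component, and are purely hypothetical placeholders.

```python
# Sketch: enrich an MT output with definitions of culture-specific terms.
# translate() and GLOSSARY are hypothetical stand-ins for the real MT system
# and the automatic definition-retrieval component.
GLOSSARY = {
    "carbonara": "pasta dish with egg, cured pork cheek and pecorino cheese",
    "ajoblanco": "cold Spanish soup made from almonds, garlic and bread",
}

def translate(text: str, target_lang: str) -> str:
    """Stub MT call: a real system would translate the entry, typically
    leaving culture-specific terms untranslated."""
    return text

def explain_entry(source_text: str, culture_specific_terms: list[str],
                  target_lang: str = "en") -> str:
    """Translate a menu entry and append definitions of its culture-specific terms."""
    translation = translate(source_text, target_lang)
    explanations = [f"{term}: {GLOSSARY[term]}"
                    for term in culture_specific_terms if term in GLOSSARY]
    if explanations:
        translation += " (" + "; ".join(explanations) + ")"
    return translation

print(explain_entry("Spaghetti alla carbonara", ["carbonara"]))
```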

2020

2013