João Almeida

2024

We present how at Unbabel we have been using Large Language Models to apply a Cultural Transcreation (CT) product on customer support (CS) emails and how we have been testing the quality and potential of this product. We discuss our preliminary evaluation of the performance of different MT models in the task of translating rephrased content and the quality of the translation outputs. Furthermore, we introduce the live pilot programme and the corresponding relevant findings, showing that transcreated content is not only culturally adequate but it is also of high rephrasing and translation quality.

pdf abs
BIT@UA at #SMM4H 2024 Tasks 1 and 5: finding adverse drug events and children’s medical disorders in English tweets
Luis Afonso | João Almeida | Rui Antunes | José Oliveira
Proceedings of The 9th Social Media Mining for Health Research and Applications (SMM4H 2024) Workshop and Shared Tasks

In this paper we present our proposed systems, for Tasks 1 and 5 of the #SMM4H-2024 shared task (Social Media Mining for Health), responsible for identifying health-related aspects in English social media text. Task 1 consisted of identifying text spans mentioning adverse drug events and linking them to unique identifiers from the medical terminology MedDRA, whereas in Task 5 the aim was to distinguish tweets that report a user having a child with a medical disorder from tweets that merely mention a disorder.For Task 1, our system, composed of a pre-trained RoBERTa model and a random forest classifier, achieved 0.397 and 0.295 entity recognition and normalization F1-scores respectively. In Task 5, we obtained a 0.840 F1-score using a pre-trained BERT model.

Co-authors

Sara Guerreiro de Sousa 1

Malene Sjørslev Søholm 1

João Almeida

2024

Co-authors

Venues