Federica Vezzani

2024

pdf bib
Proceedings of the Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context @ LREC-COLING 2024
Giorgio Maria Di Nunzio | Federica Vezzani | Liana Ermakova | Hosein Azarbonyad | Jaap Kamps
Proceedings of the Workshop on DeTermIt! Evaluating Text Difficulty in a Multilingual Context @ LREC-COLING 2024

2023

We present an overview of the Biomedical Translation Task that was part of the Eighth Conference on Machine Translation (WMT23). The aim of the task was the automatic translation of biomedical abstracts from the PubMed database. It included twelve language directions, namely, French, Spanish, Portuguese, Italian, German, and Russian, from and into English. We received submissions from 18 systems and for all the test sets that we released. Our comparison system was based on ChatGPT 3.5 and performed very well in comparison to many of the submissions.

pdf
The Importance of Being Interoperable: Theoretical and Practical Implications in Converting TBX to OntoLex-Lemon
Andrea Bellandi | Giorgio Maria Di Nunzio | Silvia Piccini | Federica Vezzani
Proceedings of the 4th Conference on Language, Data and Knowledge

2022

pdf bib abs
Knowledge Representation and Language Simplification of Human Rights
Sara Silecchia | Federica Vezzani | Giorgio Maria Di Nunzio
Proceedings of the Workshop on Terminology in the 21st century: many faces, many places

In this paper, we propose the description of a very recent interdisciplinary project aiming at analysing both the conceptual and linguistic dimensions of humanitarian rights terminology. This analysis will result in the form of a new knowledge-based multilingual terminological resource which is designed in order to meet the FAIR principles for Open Science and will serve, in the future, as a prototype for the development of a new software for the simplified rewriting of international legal texts relating to human rights. Given the early stage of the project, we will focus on the description of its rationale, the planned workflow, and the theoretical approach which will be adopted to achieve the main goal of this ambitious research project.

In the seventh edition of the WMT Biomedical Task, we addressed a total of seven languagepairs, namely English/German, English/French, English/Spanish, English/Portuguese, English/Chinese, English/Russian, English/Italian. This year’s test sets covered three types of biomedical text genre. In addition to scientific abstracts and terminology items used in previous editions, we released test sets of clinical cases. The evaluation of clinical cases translations were given special attention by involving clinicians in the preparation of reference translations and manual evaluation. For the main MEDLINE test sets, we received a total of 609 submissions from 37 teams. For the ClinSpEn sub-task, we had the participation of five teams.

2021

In the sixth edition of the WMT Biomedical Task, we addressed a total of eight language pairs, namely English/German, English/French, English/Spanish, English/Portuguese, English/Chinese, English/Russian, English/Italian, and English/Basque. Further, our tests were composed of three types of textual test sets. New to this year, we released a test set of summaries of animal experiments, in addition to the test sets of scientific abstracts and terminologies. We received a total of 107 submissions from 15 teams from 6 countries.

2020

pdf abs
On the Formal Standardization of Terminology Resources: The Case Study of TriMED
Federica Vezzani | Giorgio Maria Di Nunzio
Proceedings of the Twelfth Language Resources and Evaluation Conference

The process of standardization plays an important role in the management of terminological resources. In this context, we present the work of re-modeling an existing multilingual terminological database for the medical domain, named TriMED. This resource was conceived in order to tackle some problems related to the complexity of medical terminology and to respond to different users’ needs. We provide a methodology that should be followed in order to make a termbase compliant to the three most recent ISO/TC 37 standards. In particular, we focus on the definition of i) the structural meta-model of the resource, ii) the data categories provided, and iii) the TBX format for its implementation. In addition to the formal standardization of the resource, we describe the realization of a new data category repository for the management of the TriMED terminological data and a Web application that can be used to access the multilingual terminological records.

Machine translation of scientific abstracts and terminologies has the potential to support health professionals and biomedical researchers in some of their activities. In the fifth edition of the WMT Biomedical Task, we addressed a total of eight language pairs. Five language pairs were previously addressed in past editions of the shared task, namely, English/German, English/French, English/Spanish, English/Portuguese, and English/Chinese. Three additional languages pairs were also introduced this year: English/Russian, English/Italian, and English/Basque. The task addressed the evaluation of both scientific abstracts (all language pairs) and terminologies (English/Basque only). We received submissions from a total of 20 teams. For recurring language pairs, we observed an improvement in the translations in terms of automatic scores and qualitative evaluations, compared to previous years.

2018

pdf
TriMED: A Multilingual Terminological Database
Federica Vezzani | Giorgio Maria Di Nunzio | Geneviève Henrot
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Venues

wmt4
lrec2
term1
ldk1
determit1
show all...

ws1