Johanna Monti

2024

pdf abs
Riddle Me This: Evaluating Large Language Models in Solving Word-Based Games
Raffaele Manna | Maria Pia di Buono | Johanna Monti
Proceedings of the 10th Workshop on Games and Natural Language Processing @ LREC-COLING 2024

In this contribution, we examine the proficiency of Large Language Models (LLMs) in solving the linguistic game “La Ghigliottina,” the final game of the popular Italian TV quiz show “L’Eredità”. This game is particularly challenging as it requires LLMs to engage in semantic inference reasoning for identifying the solutions of the game. Our experiment draws inspiration from Ghigliottin-AI, a task of EVALITA 2020, an evaluation campaign focusing on Natural Language Processing (NLP) and speech tools designed for the Italian language. To benchmark our experiment, we use the results of the most successful artificial player in this task, namely Il Mago della Ghigliottina. The paper describes the experimental setting and the results which show that LLMs perform poorly.

pdf abs
Large Language Models as Legal Translators of Arabic Legislatives: Does ChatGPT and Gemini Care for Context and Terminology?
Khadija ElFqih | Johanna Monti
Proceedings of The Second Arabic Natural Language Processing Conference

Accurate translation of terminology and adaptation to in-context information is a pillar to high quality translation. Recently, there is a remarkable interest towards the use and the evaluation of Large Language Models (LLMs) particularly for Machine Translation tasks. Nevertheless, despite their recent advancement and ability to understand and generate human-like language, these LLMs are still far from perfect, especially in domain-specific scenarios, and need to be thoroughly investigated. This is particularly evident in automatically translating legal terminology from Arabic into English and French, where, beyond the inherent complexities of legal language and specialised translations, technical limitations of LLMs further hinder accurate generation of text. In this paper, we present a preliminary evaluation of two evolving LLMs, namely GPT-4 Generative Pre-trained Transformer and Gemini, as legal translators of Arabic legislatives to test their accuracy and the extent to which they care for context and terminology across two language pairs (AR→EN / AR→FR). The study targets the evaluation of Zero-Shot prompting for in-context and out-of-context scenarios of both models relying on a gold standard dataset, verified by professional translators who are also experts in the field. We evaluate the results applying the Multidimensional Quality Metrics to classify translation errors. Moreover, we also evaluate the general LLMs outputs to verify their correctness, consistency, and completeness.In general, our results show that the models are far from perfect and recall for more fine-tuning efforts using specialised terminological data in the legal domain from Arabic into English and French.

pdf abs
Prompting Large Language Models for Idiomatic Translation
Antonio Castaldo | Johanna Monti
Proceedings of the 1st Workshop on Creative-text Translation and Technology

Large Language Models (LLMs) have demonstrated impressive performance in translating content across different languages and genres. Yet, their potential in the creative aspects of machine translation has not been fully explored. In this paper, we seek to identify the strengths and weaknesses inherent in different LLMs when applied to one of the most prominent features of creative works: the translation of idiomatic expressions. We present an overview of their performance in the EN→IT language pair, a context characterized by an evident lack of bilingual data tailored for idiomatic translation. Lastly, we investigate the impact of prompt design on the quality of machine translation, drawing on recent findings which indicate a substantial variation in the performance of LLMs depending on the prompts utilized.

2023

We present version 1.3 of the PARSEME multilingual corpus annotated with verbal multiword expressions. Since the previous version, new languages have joined the undertaking of creating such a resource, some of the already existing corpora have been enriched with new annotated texts, while others have been enhanced in various ways. The PARSEME multilingual corpus represents 26 languages now. All monolingual corpora therein use Universal Dependencies v.2 tagset. They are (re-)split observing the PARSEME v.1.2 standard, which puts impact on unseen VMWEs. With the current iteration, the corpus release process has been detached from shared tasks; instead, a process for continuous improvement and systematic releases has been introduced.

pdf abs
On the Evaluation of Terminology Translation Errors in NMT and PB-SMT in the Legal Domain: a Study on the Translation of Arabic Legal Documents into English and French
Khadija Ait ElFqih | Johanna Monti
Proceedings of the Workshop on Computational Terminology in NLP and Translation Studies (ConTeNTS) Incorporating the 16th Workshop on Building and Using Comparable Corpora (BUCC)

In the translation process, terminological resources are used to solve translation problems, so information on terminological equivalence is crucial to make the most appropriate choices in terms of translation equivalence. In the context of Machine translation, indeed, neural models have improved the state-of-the-art in Machine Translation considerably in recent years. However, they still underperform in domain-specific fields and in under-resourced languages. This is particularly evident in translating legal terminology for Arabic, where current Machine Translation outputs do not adhere to the contextual, linguistic, cultural, and terminological constraints posed by translating legal terms in Arabic. In this paper, we conduct a comparative qualitative evaluation and comprehensive error analysis on legal terminology translation in Phrase-Based Statistical Machine Translation and Neural Machine Translation in two translation language pairs: Arabic-English and Arabic-French. We propose an error typology taking the legal terminology translation from Arabic into account. We demonstrate our findings, highlighting the strengths and weaknesses of both approaches in the area of legal terminology translation for Arabic. We also introduce a multilingual gold standard dataset that we developed using our Arabic legal corpus. This dataset serves as a reliable benchmark and/or reference during the evaluation process to decide the degree of adequacy and fluency of the Phrase-Based Statistical Machine Translation and Neural Machine Translation systems.

pdf
Formalizing Translation Equivalence and Lexico-Semantic Relations Among Terms in a Bilingual Terminological Resource
Giulia Speranza | Maria Pia Di Buono | Johanna Monti
Proceedings of the 4th Conference on Language, Data and Knowledge

2022

pdf abs
Assessing the Quality of an Italian Crowdsourced Idiom Corpus:the Dodiom Experiment
Giuseppina Morza | Raffaele Manna | Johanna Monti
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper describes how idiom-related language resources, collected through a crowdsourcing experiment carried out by means of Dodiom, a Game-with-a-purpose, have been analysed by language experts. The paper focuses on the criteria adopted for the data annotation and evaluation process. The main scope of this project is, indeed, the evaluation of the quality of the linguistic data obtained through a crowdsourcing project, namely to assess if the data provided and evaluated by the players who joined the game are actually considered of good quality by the language experts. Finally, results of the annotation and evaluation processes as well as future work are presented.

pdf bib
Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis
Johanna Monti | Valerio Basile | Maria Pia Di Buono | Raffaele Manna | Antonio Pascucci | Sara Tonelli
Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis

2021

pdf bib abs
gENder-IT: An Annotated English-Italian Parallel Challenge Set for Cross-Linguistic Natural Gender Phenomena
Eva Vanmassenhove | Johanna Monti
Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing

Languages differ in terms of the absence or presence of gender features, the number of gender classes and whether and where gender features are explicitly marked. These cross-linguistic differences can lead to ambiguities that are difficult to resolve, especially for sentence-level MT systems. The identification of ambiguity and its subsequent resolution is a challenging task for which currently there aren’t any specific resources or challenge sets available. In this paper, we introduce gENder-IT, an English–Italian challenge set focusing on the resolution of natural gender phenomena by providing word-level gender tags on the English source side and multiple gender alternative translations, where needed, on the Italian target side.

2020

pdf bib
Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language
Johanna Monti | Valerio Basile | Maria Pia Di Buono | Raffaele Manna | Antonio Pascucci | Sara Tonelli
Proceedings of the Workshop on Resources and Techniques for User and Author Profiling in Abusive Language

pdf abs
The Challenge of the TV game La Ghigliottina to NLP
Federico Sangati | Antonio Pascucci | Johanna Monti
Workshop on Games and Natural Language Processing

In this paper, we describe a Telegram bot, Mago della Ghigliottina (Ghigliottina Wizard), able to solve La Ghigliottina game (The Guillotine), the final game of the Italian TV quiz show L’Eredità. Our system relies on linguistic resources and artificial intelligence and achieves better results than human players (and competitors of L’Eredità too). In addition to solving a game, Mago della Ghigliottina can also generate new game instances and challenge the users to match the solution.

pdf abs
The Role of Computational Stylometry in Identifying (Misogynistic) Aggression in English Social Media Texts
Antonio Pascucci | Raffaele Manna | Vincenzo Masucci | Johanna Monti
Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying

In this paper, we describe UniOr_ExpSys team participation in TRAC-2 (Trolling, Aggression and Cyberbullying) shared task, a workshop organized as part of LREC 2020. TRAC-2 shared task is organized in two sub-tasks: Aggression Identification (a 3-way classification between “Overtly Aggressive”, “Covertly Aggressive” and “Non-aggressive” text data) and Misogynistic Aggression Identification (a binary classifier for classifying the texts as “gendered” or “non-gendered”). Our approach is based on linguistic rules, stylistic features extraction through stylometric analysis and Sequential Minimal Optimization algorithm in building the two classifiers.

pdf abs
From Linguistic Resources to Ontology-Aware Terminologies: Minding the Representation Gap
Giulia Speranza | Maria Pia di Buono | Johanna Monti | Federico Sangati
Proceedings of the Twelfth Language Resources and Evaluation Conference

Terminological resources have proven crucial in many applications ranging from Computer-Aided Translation tools to authoring softwares and multilingual and cross-lingual information retrieval systems. Nonetheless, with the exception of a few felicitous examples, such as the IATE (Interactive Terminology for Europe) Termbank, many terminological resources are not available in standard formats, such as Term Base eXchange (TBX), thus preventing their sharing and reuse. Yet, these terminologies could be improved associating the correspondent ontology-based information. The research described in the present contribution demonstrates the process and the methodologies adopted in the automatic conversion into TBX of such type of resources, together with their semantic enrichment based on the formalization of ontological information into terminologies. We present a proof-of-concept using the Italian Linguistic Resource for the Archaeological domain (developed according to Thesauri and Guidelines of the Italian Central Institute for the Catalogue and Documentation). Further, we introduce the conversion tool developed to support the process of creating ontology-aware terminologies for improving interoperability and sharing of existing language technologies and data sets.

A Case Study of Natural Gender Phenomena in Translation: A Comparison of Google Translate, Bing Microsoft Translator and DeepL for English to Italian, French and Spanish
Argentina Anna Rescigno | Johanna Monti | Andy Way | Eva Vanmassenhove
Workshop on the Impact of Machine Translation (iMpacT 2020)

pdf abs
UNIOR NLP at MWSA Task - GlobaLex 2020: Siamese LSTM with Attention for Word Sense Alignment
Raffaele Manna | Giulia Speranza | Maria Pia di Buono | Johanna Monti
Proceedings of the 2020 Globalex Workshop on Linked Lexicography

In this paper we describe the system submitted to the ELEXIS Monolingual Word Sense Alignment Task. We test different systems,which are two types of LSTMs and a system based on a pretrained Bidirectional Encoder Representations from Transformers (BERT)model, to solve the task. LSTM models use fastText pre-trained word vectors features with different settings. For training the models,we did not combine external data with the dataset provided for the task. We select a sub-set of languages among the proposed ones,namely a set of Romance languages, i.e., Italian, Spanish, Portuguese, together with English and Dutch. The Siamese LSTM withattention and PoS tagging (LSTM-A) performed better than the other two systems, achieving a 5-Class Accuracy score of 0.844 in theOverall Results, ranking the first position among five teams.

We present edition 1.2 of the PARSEME shared task on identification of verbal multiword expressions (VMWEs). Lessons learned from previous editions indicate that VMWEs have low ambiguity, and that the major challenge lies in identifying test instances never seen in the training data. Therefore, this edition focuses on unseen VMWEs. We have split annotated corpora so that the test corpora contain around 300 unseen VMWEs, and we provide non-annotated raw corpora to be used by complementary discovery methods. We released annotated and raw corpora in 14 languages, and this semi-supervised challenge attracted 7 teams who submitted 9 system results. This paper describes the effort of corpus creation, the task design, and the results obtained by the participating systems, especially their performance on unseen expressions.

pdf abs
Is this hotel review truthful or deceptive? A platform for disinformation detection through computational stylometry
Antonio Pascucci | Raffaele Manna | Ciro Caterino | Vincenzo Masucci | Johanna Monti
Proceedings for the First International Workshop on Social Threats in Online Conversations: Understanding and Management

In this paper, we present a web service platform for disinformation detection in hotel reviews written in English. The platform relies on a hybrid approach of computational stylometry techniques, machine learning and linguistic rules written using COGITO, Expert System Corp.’s semantic intelligence software thanks to which it is possible to analyze texts and extract all their characteristics. We carried out a research experiment on the Deceptive Opinion Spam corpus, a balanced corpus composed of 1,600 hotel reviews of 20 Chicago hotels split into four datasets: positive truthful, negative truthful, positive deceptive and negative deceptive reviews. We investigated four different classifiers and we detected that Simple Logistic is the most performing algorithm for this type of classification.

2018

This paper describes the PARSEME Shared Task 1.1 on automatic identification of verbal multiword expressions. We present the annotation methodology, focusing on changes from last year’s shared task. Novel aspects include enhanced annotation guidelines, additional annotated data for most languages, corpora for some new languages, and new evaluation settings. Corpora were created for 20 languages, which are also briefly discussed. We report organizational principles behind the shared task and the evaluation metrics employed for ranking. The 17 participating systems, their methods and obtained results are also presented and analysed.

2017

Multiword expressions (MWEs) are a class of linguistic forms spanning conventional word boundaries that are both idiosyncratic and pervasive across different languages. The structure of linguistic processing that depends on the clear distinction between words and phrases has to be re-thought to accommodate MWEs. The issue of MWE handling is crucial for NLP applications, where it raises a number of challenges. The emergence of solutions in the absence of guiding principles motivates this survey, whose aim is not only to provide a focused review of MWE processing, but also to clarify the nature of interactions between MWE processing and downstream applications. We propose a conceptual framework within which challenges and research contributions can be positioned. It offers a shared understanding of what is meant by “MWE processing,” distinguishing the subtasks of MWE discovery and identification. It also elucidates the interactions between MWE processing and two use cases: Parsing and machine translation. Many of the approaches in the literature can be differentiated according to how MWE processing is timed with respect to underlying use cases. We discuss how such orchestration choices affect the scope of MWE-aware systems. For each of the two MWE processing subtasks and for each of the two use cases, we conclude on open issues and research perspectives.

2016

This paper summarizes the preliminary results of an ongoing survey on multiword resources carried out within the IC1207 Cost Action PARSEME (PARSing and Multi-word Expressions). Despite the availability of language resource catalogs and the inventory of multiword datasets on the SIGLEX-MWE website, multiword resources are scattered and difficult to find. In many cases, language resources such as corpora, treebanks, or lexical databases include multiwords as part of their data or take them into account in their annotations. However, these resources need to be centralized to make them accessible. The aim of this survey is to create a portal where researchers can easily find multiword(-aware) language resources for their research. We report on the design of the survey and analyze the data gathered so far. We also discuss the problems we have detected upon examination of the data as well as possible ways of enhancing the survey.

2014

This paper presents a systematic human evaluation of translations of English support verb constructions produced by a rule-based machine translation (RBMT) system (OpenLogos) and a statistical machine translation (SMT) system (Google Translate) for five languages: French, German, Italian, Portuguese and Spanish. We classify support verb constructions by means of their syntactic structure and semantic behavior and present a qualitative analysis of their translation errors. The study aims to verify how machine translation (MT) systems translate fine-grained linguistic phenomena, and how well-equipped they are to produce high-quality translation. Another goal of the linguistically motivated quality analysis of SVC raw output is to reinforce the need for better system hybridization, which leverages the strengths of RBMT to the benefit of SMT, especially in improving the translation of multiword units. Taking multiword units into account, we propose an effective method to achieve MT hybridization based on the integration of semantico-syntactic knowledge into SMT.

2013

pdf
Cross-Lingual Information Retrieval and Semantic Interoperability for Cultural Heritage Repositories
Johanna Monti | Mario Monteleone | Maria Pia di Buono | Federica Marano
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf
When multiwords go bad in machine translation
Anabela Barreiro | Johanna Monti | Brigitte Orliac | Fernando Batista
Proceedings of the Workshop on Multi-word Units in Machine Translation and Translation Technologies

pdf
Multi-word processing in an ontology-based cross-language information retrieval model for specific domain collections
Maria Pia di Buono | Johanna Monti | Mario Monteleone | Federica Marano
Proceedings of the Workshop on Multi-word Units in Machine Translation and Translation Technologies

2011

pdf bib
In search of knowledge: text mining dedicated to technical translation
Johanna Monti | Annibale Elia | Alberto Postiglione | Maria Monteleone | Federica Marano
Proceedings of Translating and the Computer 33

pdf abs
Taking on new challenges in multi-word unit processing for machine translation
Johanna Monti | Anabela Barreiro | Annibale Elia | Federica Marano | Antonella Napoli
Proceedings of the Second International Workshop on Free/Open-Source Rule-Based Machine Translation

This paper discusses the qualitative comparative evaluation performed on the results of two machine translation systems with different approaches to the processing of multi-word units. It proposes a solution for overcoming the difficulties multi-word units present to machine translation by adopting a methodology that combines the lexicon grammar approach with OpenLogos ontology and semantico-syntactic rules. The paper also discusses the importance of a qualitative evaluation metrics to correctly evaluate the performance of machine translation engines with regards to multi-word units.