Proceedings of the 1st Workshop on Artificial Intelligence and Easy and Plain Language in Institutional Contexts (AI & EL/PL)

María Isabel Rivas Ginel, Patrick Cadwell, Paolo Canavese, Silvia Hansen-Schirra, Martin Kappus, Anna Matamala, Will Noonan (Editors)

Anthology ID: 2025.aielpl-1
Month: June
Year: 2025
Address: Geneva, Switzerland
Venue: AIELPL
Publisher: European Association for Machine Translation
URL: https://preview.aclanthology.org/mtsummit-25-ingestion/2025.aielpl-1/
ISBN: 978-2-9701897-5-6
PDF: https://preview.aclanthology.org/mtsummit-25-ingestion/2025.aielpl-1.pdf

Leveraging Large Language Models for Joint Linguistic and Technical Accessibility Improvement: A Case Study on University Webpages
Pierrette Bouillon | Johanna Gerlach | Raphael Rubino

The aim of the study presented in this paper is to investigate whether Large Language Models can be leveraged to translate French content from existing websites into their B1-level simplified versions and to integrate them into an accessible HTML structure. We design a CMS-agnostic approach to webpage accessibility improvement based on prompt engineering and apply it to Geneva University webpages. We conduct several automatic and manual evaluations to measure the accessibility improvement reached by several LLMs with various prompts in a zero-shot setting. Results show that not all LLMs are suitable for the task, and that results vary widely across prompts. Manual evaluation carried out by a dyslexic crowd shows that some LLMs could produce more accessible websites and improve access to information.

How Artificial Intelligence can help in the Easy-to-Read Adaptation of Numerical Expressions in Spanish
Mari Carmen Suárez-Figueroa | Alejandro Muñoz-Navarro | Isam Diab

Numerical expressions, specifically fractions and percentages in texts, may pose a difficulty in the reading comprehension process for different groups of the population, including persons with cognitive disabilities. As an element that facilitates reading comprehension, the Easy-to-Read (E2R) Methodology, created to achieve so-called cognitive accessibility, recommends avoiding the use of fractions and percentages. If it is necessary to include them, their equivalence or explanation should be described. In order to help people who have difficulties in reading comprehension when they have to deal with fractions and percentages, we have developed an initial method for automatically adapting numerical expressions in Spanish. This method is based on (a) Artificial Intelligence (AI) methods and techniques and (b) the E2R guidelines and recommendations. In addition, the method has been implemented as a web application. To situate our research in the context of so-called responsible AI, we followed the human-centred design approach called participatory design. In this regard, we involved people with cognitive disabilities in order to (a) reinforce the adaptations provided by E2R experts and included in our method, and (b) evaluate our application to automatically adapt numerical expressions following an E2R approach. Moreover, this method can be integrated into institutional procedures, such as those of university administrations and public organisations, to enhance the accessibility of official documents and educational materials.

Large Language Models Applied to Controlled Natural Languages in Communicating Diabetes Therapies
Federica Vezzani | Sara Vecchiato | Elena Frattolin

The aim of this exploratory study is to test the possibility of enhancing the quality of institutional communication related to diabetes self-treatment by switching from manual to prompt-based writing. The study proposes an investigation into the use of prompts applied to controlled natural language, particularly in Italian, French and English. Starting from a corpus of three comparable texts concerning the so-called Rule of 15, a reformulation is undertaken in accordance with the principles of controlled natural languages. Feedback will be gathered through a Likert scale questionnaire and a comprehension test administered to anonymous volunteers.

Simplifying Lithuanian text into Easy-to-Read language using large language models
Simona Kuoraitė | Valentas Gružauskas

This paper explores the task of simplifying Lithuanian text into Easy-to-Read language. Easy-to-Read language is text written in short, clear sentences and simple words, adapted for people with intellectual disabilities or limited language skills. The aim of this work is to investigate how the large language model Lt-Llama-2-7b-hf, pre-trained on Lithuanian language data, can be adapted to the task of simplifying Lithuanian texts into Easy-to-Read language. To achieve this goal, specialized datasets were developed to fine-tune the model, and experiments were carried out. The model was tested by presenting the texts in their original language and the texts with a prompt adapted to the task. The results were evaluated using the SARI metric for assessing the quality of simplified texts and a qualitative evaluation of the large language model. The results show that the fine-tuned model sometimes simplifies text better than a model that has not been fine-tuned, but that a larger and more extensive dataset would be needed to achieve significant results, and that more research should be carried out on fine-tuning the model for this task.

ChatGPT and Mistral as a tool for intralingual translation into Easy French
Julia Degenhardt

FALC is a simplified variety of French designed to enhance text comprehensibility and accessibility. Despite its societal benefits, the availability of FALC texts remains limited due to the costly human translation process. This study explores the potential of LLMs, specifically ChatGPT and Mistral, as a tool for automatic intralingual translations. The AI-generated translations of standard French texts on sexual health are compared to human-translated versions. Using a mixed-method approach, the study evaluates content accuracy, readability, and syntactic complexity.

Simplifying healthcare communication: Evaluating AI-driven plain language editing of informed consent forms
Vicent Briva-Iglesias | Isabel Peñuelas Gil

Clear communication between patients and healthcare providers is crucial, particularly in informed consent forms (ICFs), which are often written in complex, technical language. This paper explores the effectiveness of generative artificial intelligence (AI) for simplifying ICFs into Plain Language (PL), aiming to enhance patient comprehension and informed decision-making. Using a corpus of 100 cancer-related ICFs, two distinct prompt engineering strategies (Simple AI Edit and Complex AI Edit) were evaluated through readability metrics: Flesch Reading Ease, Gunning Fog Index, and SMOG Index. Statistical analyses revealed significant improvements in readability for AI-simplified texts compared to original documents. Interestingly, the Simple AI Edit strategy consistently outperformed the Complex AI Edit across all metrics. These findings suggest that minimalistic prompt strategies may be optimal, democratizing AI-driven text simplification in healthcare by requiring less expertise and resources. The study underscores the potential for AI to significantly improve patient-provider communication, highlighting future research directions for qualitative assessments and multilingual applications.
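
For readers unfamiliar with the three readability metrics named in the abstract above, the following is a minimal sketch of their standard English-language formulas. It is not taken from the paper: the function names are hypothetical, and the functions assume that word, sentence, and syllable counts have already been obtained by some other means, since the abstract does not say how the authors computed them.

```python
import math


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Standard Gunning Fog Index; complex_words are words of 3+ syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def smog_index(polysyllables: int, sentences: int) -> float:
    """Standard SMOG grade; polysyllables are words of 3+ syllables."""
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291


if __name__ == "__main__":
    # Illustrative counts only (assumed, not from the paper's corpus).
    print(flesch_reading_ease(words=100, sentences=5, syllables=150))
    print(gunning_fog(words=100, sentences=5, complex_words=10))
    print(smog_index(polysyllables=30, sentences=30))
```

Taking raw counts as inputs keeps the sketch independent of any particular tokenizer or syllable counter, which is where implementations of these metrics usually differ.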

Translating Easy Language administrative texts: a quantitative analysis of DeepL’s performance from German into Italian using a bilingual corpus
Christiane Maaß | Chiara Fioravanti

This study evaluates the performance of DeepL, an AI-based translation engine, in translating German Easy Language texts into Italian. The evaluation is based on a corpus of 26 German fact sheets and their Italian human translations. The results show that DeepL’s translations exhibit significant errors in terminology, accuracy, and language conventions. The machine-translated texts often lack consistency in terminology, and the use of technical or unfamiliar words is not adapted to the difficulty level of the target language. Furthermore, the translations tend to normalize the texts towards standard administrative language, making them less accessible. The study highlights the need for human post-editing to ensure both accuracy and suitability of the translated texts. The findings of this study will help identify where to prioritize post-editing efforts and facilitate comparisons with the results obtained from other artificial intelligence tools used for interlingual translation of Easy Language texts in the administrative domain.

Do professionally adapted texts follow existing Easy-to-Understand (E2U) language guidelines? A quantitative analysis of two professionally adapted corpora
Andreea Deleanu | Constantin Orăsan | Shenbin Qian | Anastasiia Bezobrazova | Sabine Braun

Easy-to-Understand (E2U) language varieties have been recognized by the UN Convention on the Rights of Persons with Disabilities as a means to prevent communicative exclusion of those facing cognitive barriers and guarantee the fundamental right to Accessible Communication. However, guidance on what makes language ‘easier to understand’ is still fragmented and vague, leading practitioners to rely on their individual expertise. For this reason, this article presents a quantitative corpus analysis to further understand which features of E2U language can more effectively improve verbal comprehension according to professional practice. This is achieved by analysing two parallel corpora of standard and professionally adapted E2U articles to identify adaptation practices implemented according to, in spite of or in addition to official E2U guidelines (Deleanu et al., 2024). The results of the corpus analysis provide insight into the most effective adaptation strategies that can reduce complexity in verbal discourse. This article will present the methods and results of the corpus analysis.

Quantifying word complexity for Leichte Sprache: A computational metric and its psycholinguistic validation
Umesh Patil | Jesus Calvillo | Sol Lago | Anne-Kathrin Schumann

Leichte Sprache (Easy Language or Easy German) is a strongly simplified version of German geared toward a target group with limited language proficiency. In Germany, public bodies are required to provide information in Leichte Sprache. Unfortunately, Leichte Sprache rules are traditionally defined by non-linguists; they are not rooted in linguistic research, and they do not provide precise decision criteria or devices for measuring the complexity of linguistic structures (Bock and Pappert, 2023). For instance, one of the rules simply recommends the usage of simple rather than complex words. In this paper, we therefore propose a model to determine word complexity. We train an XGBoost model for classifying word complexity by leveraging word-level linguistic and corpus-level distributional features, frequency information from an in-house Leichte Sprache corpus and human complexity annotations. We psycholinguistically validate our model by showing that it captures human word recognition times above and beyond traditional word-level predictors. Moreover, we discuss a number of practical applications of our classifier, such as the evaluation of AI-simplified text and detection of CEFR levels of words. To our knowledge, this is one of the first attempts to systematically quantify word complexity in the context of Leichte Sprache and to link it directly to real-time word processing.

Several people are excluded from democratic deliberation because the language used in this context may be too difficult for them to understand. Our iDEM project aims at lowering existing linguistic barriers in deliberative processes by developing technology to facilitate the translation of complicated text into easy-to-read formats which are more suitable for many people. In this paper we describe classification experiments for detecting different types of difficulties which should be amended in order to make texts easier to understand. We focus on a lexical simplification system which can achieve state-of-the-art results with the use of a free and open-weight Large Language Model for the Romance Languages in the iDEM project. Moreover, a sentence segmentation system is introduced that can segment long sentences based on training data. We describe the iDEM mobile app, which will make our technology available as a service for end-users of our target populations.