2024
Uncertainty Estimation in Large Language Models to Support Biodiversity Conservation
Maria Mora-Cross | Saul Calderon-Ramirez
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)
Large Language Models (LLMs) provide significant value in question answering (QA) scenarios and have practical applications in complex decision-making contexts, such as biodiversity conservation. However, despite substantial performance improvements, they may still produce inaccurate outcomes. Consequently, incorporating uncertainty quantification alongside predictions is essential for mitigating the potential risks associated with their use. This study introduces an exploratory analysis of the application of Monte Carlo Dropout (MCD) and Expected Calibration Error (ECE) to assess the uncertainty of generative language models. To that end, we analyzed two publicly available language models (Falcon-7B and DistilGPT-2). Our findings suggest the viability of employing ECE as a metric to estimate uncertainty in generative LLMs. The findings from this research contribute to a broader project aimed at facilitating free and open access to standardized and integrated data and services about Costa Rica’s biodiversity to support the development of science, education, and biodiversity conservation.
An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework
Matthew Shardlow | Fernando Alva-Manchego | Riza Batista-Navarro | Stefan Bott | Saul Calderon Ramirez | Rémi Cardon | Thomas François | Akio Hayakawa | Andrea Horbach | Anna Hülsing | Yusuke Ide | Joseph Marvin Imperial | Adam Nohejl | Kai North | Laura Occhipinti | Nelson Peréz Rojas | Nishat Raihan | Tharindu Ranasinghe | Martin Solis Salazar | Marcos Zampieri | Horacio Saggion
Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024
We present preliminary findings on the MultiLS dataset, developed in support of the 2024 Multilingual Lexical Simplification Pipeline (MLSP) Shared Task. This dataset currently comprises 300 instances of lexical complexity prediction and lexical simplification across 10 languages. In this paper, we (1) describe the annotation protocol in support of the contribution of future datasets and (2) present summary statistics on the existing data that we have gathered. Multilingual lexical simplification can help low-ability readers engage with otherwise difficult texts in their native, often low-resourced, languages.
The BEA 2024 Shared Task on the Multilingual Lexical Simplification Pipeline
Matthew Shardlow | Fernando Alva-Manchego | Riza Batista-Navarro | Stefan Bott | Saul Calderon Ramirez | Rémi Cardon | Thomas François | Akio Hayakawa | Andrea Horbach | Anna Hülsing | Yusuke Ide | Joseph Marvin Imperial | Adam Nohejl | Kai North | Laura Occhipinti | Nelson Peréz Rojas | Nishat Raihan | Tharindu Ranasinghe | Martin Solis Salazar | Sanja Štajner | Marcos Zampieri | Horacio Saggion
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
We report the findings of the 2024 Multilingual Lexical Simplification Pipeline shared task. We released a new dataset comprising 5,927 instances of lexical complexity prediction and lexical simplification on common contexts across 10 languages, split into trial (300) and test (5,627). Ten teams participated across 2 tracks and 10 languages, with 233 runs evaluated across all systems. Five teams participated in all languages for the lexical complexity prediction task and four teams participated in all languages for the lexical simplification task. Teams employed a range of strategies, making use of open- and closed-source large language models for lexical simplification, as well as feature-based approaches for lexical complexity prediction. The highest scoring team on the combined multilingual data obtained a Pearson’s correlation of 0.6241 and an ACC@1@Top1 of 0.3772, both demonstrating that there is still room for improvement on two difficult sub-tasks of the lexical simplification pipeline.