Towards Content Accessibility Through Lexical Simplification for Maltese as a Low-Resource Language

Martina Meli, Marc Tanti, Chris Porter


Abstract
Natural Language Processing techniques have been developed to assist in simplifying online content while preserving meaning. However, for low-resource languages such as Maltese, numerous challenges and limitations remain. Lexical Simplification (LS) is a core technique typically adopted to improve content accessibility, and has been widely studied for high-resource languages such as English and French. Motivated by the need to improve access to Maltese content and the limitations in this context, this work set out to develop and evaluate an LS system for Maltese text. An LS pipeline was developed consisting of (1) potential complex word identification, (2) substitute generation, (3) substitute selection, and (4) substitute ranking. An evaluation data set was developed to assess the performance of each step. Results are encouraging and point to several directions for future work. Finally, a single-blind study was carried out with over 200 participants to evaluate the perceived quality of the system's simplifications. Results suggest that meaning is retained about 50% of the time, and when meaning is retained, about 70% of system-generated sentences are perceived as either simpler than or equally simple as the original. Challenges remain, and this study proposes a number of areas that may benefit from further research.
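To make the four pipeline stages named in the abstract concrete, the following is a minimal sketch of how such stages could be chained. It is not the authors' implementation: the frequency table, synonym table, threshold, and all function names are hypothetical, and the toy vocabulary is English placeholder data rather than Maltese. A frequency-based heuristic stands in for whatever models the paper actually uses at each step.

```python
from typing import Dict, List

# Toy, hypothetical resources (NOT from the paper): word frequencies and synonyms.
FREQ: Dict[str, int] = {
    "use": 900, "utilise": 40, "employ": 120,
    "the": 5000, "we": 3000, "tool": 700,
}
SYNONYMS: Dict[str, List[str]] = {"utilise": ["use", "employ", "wield"]}
COMPLEX_THRESHOLD = 100  # assumed cut-off: rarer words count as potentially complex


def identify_complex(tokens: List[str]) -> List[int]:
    """Step 1: flag positions of tokens whose frequency is below the threshold."""
    return [i for i, t in enumerate(tokens) if FREQ.get(t, 0) < COMPLEX_THRESHOLD]


def generate_substitutes(word: str) -> List[str]:
    """Step 2: propose candidate replacements from the synonym table."""
    return SYNONYMS.get(word, [])


def select_substitutes(word: str, candidates: List[str]) -> List[str]:
    """Step 3: keep candidates that are known words and differ from the original."""
    return [c for c in candidates if c in FREQ and c != word]


def rank_substitutes(candidates: List[str]) -> List[str]:
    """Step 4: order candidates from most to least frequent (a proxy for simplicity)."""
    return sorted(candidates, key=lambda c: FREQ[c], reverse=True)


def simplify(sentence: str) -> str:
    """Run the four steps in order over a whitespace-tokenised, lowercased sentence."""
    tokens = sentence.lower().split()
    for i in identify_complex(tokens):
        candidates = generate_substitutes(tokens[i])
        ranked = rank_substitutes(select_substitutes(tokens[i], candidates))
        # Replace only when the top candidate is more frequent than the original word.
        if ranked and FREQ[ranked[0]] > FREQ.get(tokens[i], 0):
            tokens[i] = ranked[0]
    return " ".join(tokens)


print(simplify("We utilise the tool"))  # prints "we use the tool"
```

Keeping each stage as a separate function mirrors the abstract's point that every step was evaluated individually: each function can be tested or swapped out without touching the others.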
Anthology ID:
2024.ltedi-1.5
Volume:
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
Month:
March
Year:
2024
Address:
St. Julian's, Malta
Editors:
Bharathi Raja Chakravarthi, Bharathi B, Paul Buitelaar, Thenmozhi Durairaj, György Kovács, Miguel Ángel García Cumbreras
Venues:
LTEDI | WS
Publisher:
Association for Computational Linguistics
Pages:
41–51
URL:
https://aclanthology.org/2024.ltedi-1.5
Cite (ACL):
Martina Meli, Marc Tanti, and Chris Porter. 2024. Towards Content Accessibility Through Lexical Simplification for Maltese as a Low-Resource Language. In Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 41–51, St. Julian's, Malta. Association for Computational Linguistics.
Cite (Informal):
Towards Content Accessibility Through Lexical Simplification for Maltese as a Low-Resource Language (Meli et al., LTEDI-WS 2024)
PDF:
https://preview.aclanthology.org/naacl24-info/2024.ltedi-1.5.pdf
Video:
https://preview.aclanthology.org/naacl24-info/2024.ltedi-1.5.mp4