Neural Readability Pairwise Ranking for Sentences in Italian Administrative Language

Martina Miliani; Serena Auriemma; Fernando Alva-Manchego; Alessandro Lenci

Neural Readability Pairwise Ranking for Sentences in Italian Administrative Language

Martina Miliani, Serena Auriemma, Fernando Alva-Manchego, Alessandro Lenci

Abstract

Automatic Readability Assessment aims at assigning a complexity level to a given text, which could help improve the accessibility to information in specific domains, such as the administrative one. In this paper, we investigate the behavior of a Neural Pairwise Ranking Model (NPRM) for sentence-level readability assessment of Italian administrative texts. To deal with data scarcity, we experiment with cross-lingual, cross- and in-domain approaches, and test our models on Admin-It, a new parallel corpus in the Italian administrative language, containing sentences simplified using three different rewriting strategies. We show that NPRMs are effective in zero-shot scenarios (~0.78 ranking accuracy), especially with ranking pairs containing simplifications produced by overall rewriting at the sentence-level, and that the best results are obtained by adding in-domain data (achieving perfect performance for such sentence pairs). Finally, we investigate where NPRMs failed, showing that the characteristics of the training data, rather than its size, have a bigger effect on a model’s performance.

Anthology ID:: 2022.aacl-main.63
Volume:: Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:: November
Year:: 2022
Address:: Online only
Editors:: Yulan He, Heng Ji, Sujian Li, Yang Liu, Chua-Hui Chang
Venues:: AACL | IJCNLP
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 849–866
Language:
URL:: https://aclanthology.org/2022.aacl-main.63
DOI:
Bibkey:
Cite (ACL):: Martina Miliani, Serena Auriemma, Fernando Alva-Manchego, and Alessandro Lenci. 2022. Neural Readability Pairwise Ranking for Sentences in Italian Administrative Language. In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 849–866, Online only. Association for Computational Linguistics.
Cite (Informal):: Neural Readability Pairwise Ranking for Sentences in Italian Administrative Language (Miliani et al., AACL-IJCNLP 2022)
Copy Citation:
PDF:: https://preview.aclanthology.org/emnlp-22-attachments/2022.aacl-main.63.pdf

PDF Search