Abstract
In this paper, we report on some experiments aimed at exploring the relation between document-level and sentence-level readability assessment for French. These were run on an open-source tailored corpus, which was automatically created by aggregating various sources from children’s literature. On top of providing the research community with a freely available corpus, we report on sentence readability scores obtained when applying both classical approaches (aka readability formulas) and state-of-the-art deep learning techniques (e.g. fine-tuning of large language models). Results show a relatively strong correlation between document-level and sentence-level readability, suggesting ways to reduce the cost of building annotated sentence-level readability datasets.
- Anthology ID:
- 2023.tsar-1.8
- Volume:
- Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability
- Month:
- September
- Year:
- 2023
- Address:
- Varna, Bulgaria
- Editors:
- Sanja Štajner, Horacio Saggio, Matthew Shardlow, Fernando Alva-Manchego
- Venues:
- TSAR | WS
- Publisher:
- INCOMA Ltd., Shoumen, Bulgaria
- Pages:
- 78–84
- URL:
- https://aclanthology.org/2023.tsar-1.8
- Cite (ACL):
- Duy Van Ngo and Yannick Parmentier. 2023. Towards Sentence-level Text Readability Assessment for French. In Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability, pages 78–84, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
- Cite (Informal):
- Towards Sentence-level Text Readability Assessment for French (Ngo & Parmentier, TSAR-WS 2023)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2023.tsar-1.8.pdf