Predicting Prosodic Boundaries for Children’s Texts

Mansi Dhamne, Sneha Raman, Preeti Rao


Abstract
Reading fluency in any language requires not only accurate word decoding but also natural prosodic phrasing, i.e., the grouping of words into rhythmically and syntactically coherent units. This holds for both reading aloud and silent reading. While adults pause meaningfully at clause or punctuation boundaries, children aged 8–13 often insert inappropriate pauses due to limited breath control and underdeveloped prosodic awareness. We present a text-based model to predict cognitively appropriate pause locations in children’s reading material. Using a curated dataset of 54 leveled English stories annotated for potential pauses, or prosodic boundaries, by 21 fluent speakers, we find that nearly 30% of pauses occur at non-punctuation locations in the text, highlighting the limitations of relying on punctuation-based cues alone. Our model combines lexical, syntactic, and contextual features with a novel breath duration feature that captures the syllable load since the last major boundary. This cognitively motivated approach can model both allowed and “forbidden” pauses. The proposed framework supports applications such as child-directed text-to-speech (TTS) and oral reading fluency assessment, where the proper grouping of words is considered critical to reading comprehension.
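As an illustration of the breath duration idea, the sketch below accumulates a running syllable count that resets after each major boundary. It is not the authors' implementation: the abstract does not say how syllables are counted or what counts as a major boundary, so the vowel-group syllable heuristic, the choice of punctuation marks, and the names `count_syllables` and `syllable_load` are all assumptions made for this example.

```python
import re

# Assumption: a "major boundary" is a token ending in one of these marks.
MAJOR_BOUNDARY = re.compile(r"[.,;:!?]$")


def count_syllables(word):
    """Crude estimate: number of vowel groups in the word, at least 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def syllable_load(text):
    """Pair each whitespace token with the syllables spoken since the last
    major boundary, counting the token itself."""
    load = 0
    pairs = []
    for token in text.split():
        load += count_syllables(token)
        pairs.append((token, load))
        # Reset after a boundary so the next token starts a fresh breath group.
        if MAJOR_BOUNDARY.search(token):
            load = 0
    return pairs


if __name__ == "__main__":
    for token, load in syllable_load("The cat sat, then ran."):
        print(token, load)
```

A higher load at a given word would suggest that a reader is more likely to need a pause there, even where the text has no punctuation. A real system would likely replace the vowel-group heuristic with a pronunciation lexicon.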
Anthology ID:
2025.emnlp-main.1623
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
31863–31873
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1623/
Cite (ACL):
Mansi Dhamne, Sneha Raman, and Preeti Rao. 2025. Predicting Prosodic Boundaries for Children’s Texts. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 31863–31873, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Predicting Prosodic Boundaries for Children’s Texts (Dhamne et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1623.pdf
Checklist:
 2025.emnlp-main.1623.checklist.pdf