Sneha Raman




2025

Predicting Prosodic Boundaries for Children’s Texts
Mansi Dhamne | Sneha Raman | Preeti Rao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Reading fluency in any language requires not only accurate word decoding but also natural prosodic phrasing, i.e., the grouping of words into rhythmically and syntactically coherent units. This holds for both reading aloud and silent reading. While adults pause meaningfully at clause or punctuation boundaries, children aged 8-13 often insert inappropriate pauses due to limited breath control and underdeveloped prosodic awareness. We present a text-based model to predict cognitively appropriate pause locations in children’s reading material. Using a curated dataset of 54 leveled English stories annotated for potential pauses, or prosodic boundaries, by 21 fluent speakers, we find that nearly 30% of pauses occur at non-punctuation locations in the text, highlighting the limitations of relying on punctuation-based cues alone. Our model combines lexical, syntactic, and contextual features with a novel breath duration feature that captures the syllable load since the last major boundary. This cognitively motivated approach can model both allowed and “forbidden” pauses. The proposed framework supports applications such as child-directed TTS and oral reading fluency assessment, where the proper grouping of words is considered critical to reading comprehension.
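
The abstract does not say how the breath duration feature is computed. As a rough illustration only, the sketch below tracks a running syllable count that resets at each major boundary. Everything in it is an assumption rather than the authors' method: the function names `count_syllables` and `syllable_load`, the choice of punctuation marks treated as major boundaries, and the vowel-group heuristic used to count syllables.

```python
import re

# Assumption: punctuation tokens stand in for "major boundaries".
# The paper may define major boundaries differently.
MAJOR_BOUNDARY = {".", "!", "?", ";", ":", ","}


def count_syllables(word: str) -> int:
    """Crude syllable estimate: number of vowel groups, at least 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def syllable_load(tokens: list[str]) -> list[int]:
    """Return, for each token, the syllables accumulated since the
    last major boundary (inclusive of the token itself).
    Boundary tokens reset the count and are assigned 0."""
    load = 0
    loads = []
    for tok in tokens:
        if tok in MAJOR_BOUNDARY:
            load = 0
            loads.append(0)
        else:
            load += count_syllables(tok)
            loads.append(load)
    return loads


if __name__ == "__main__":
    tokens = ["The", "little", "cat", "sat", ".", "It", "purred"]
    for tok, load in zip(tokens, syllable_load(tokens)):
        print(f"{tok}\t{load}")
```

A feature of this kind grows with each word spoken on one breath, so a classifier could plausibly use it to favor a pause at a non-punctuation location once the load becomes large.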