Catarina Belem


2025

Readability Reconsidered: A Cross-Dataset Analysis of Reference-Free Metrics
Catarina Belem | Parker Glenn | Alfy Samuel | Anoop Kumar | Daben Liu
Proceedings of the Fourth Workshop on Text Simplification, Accessibility and Readability (TSAR 2025)

Automatic readability assessment plays a key role in ensuring effective communication between humans and language models. Despite significant progress, the field is hindered by inconsistent definitions of readability and by measurements that rely on surface-level text properties. In this work, we investigate the factors shaping human perceptions of readability through the analysis of 1.2k judgments, finding that, beyond surface-level cues, information content and topic strongly shape text comprehensibility. Furthermore, we evaluate 15 popular readability metrics across 5 datasets, contrasting them with 5 more nuanced model-based metrics. Our results show that four model-based metrics consistently place among the top four in rank correlations with human judgments, while the best-performing traditional metric achieves an average rank of 7.8. These findings highlight a mismatch between current readability metrics and human perceptions, pointing to model-based approaches as a more promising direction.
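
The sketch below illustrates the style of evaluation the abstract describes: ranking readability metrics by their Spearman rank correlation with human judgments. It is not the authors' code; the metric names, the synthetic scores, and the rank_metrics helper are hypothetical placeholders, and only scipy.stats.spearmanr is assumed from a real library.

    # Hedged sketch, not the paper's implementation: rank readability
    # metrics by Spearman correlation with human readability judgments.
    import numpy as np
    from scipy.stats import spearmanr

    def rank_metrics(human_scores, metric_scores):
        # Correlate each metric's per-text scores with the human
        # judgments, then sort metric names from best to worst.
        corrs = {}
        for name, scores in metric_scores.items():
            rho, _ = spearmanr(human_scores, scores)
            corrs[name] = rho
        return sorted(corrs, key=corrs.get, reverse=True)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        human = rng.normal(size=50)  # stand-in for human judgments
        metrics = {
            # Hypothetical metrics: noisier proxy vs. closer proxy.
            "surface_metric": human + rng.normal(scale=1.5, size=50),
            "model_based": human + rng.normal(scale=0.5, size=50),
        }
        print(rank_metrics(human, metrics))

In the paper's setting, this kind of ranking would be computed per dataset and the per-dataset ranks averaged, which is where figures like the average rank of 7.8 come from.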