ProPara-CRTS: Canonical Referent Tracking for Reliable Evaluation of Entity State Tracking in Process Narratives

Bingyang Ye, Timothy Obiso, Jingxuan Tu, James Pustejovsky


Abstract
Despite the abundance of datasets for procedural texts such as cooking recipes, resources that capture full process narratives (paragraph-long descriptions that follow how multiple entities evolve across a sequence of steps) remain scarce. Although synthetic resources offer useful toy settings, they fail to capture the linguistic variability of naturally occurring prose. ProPara remains the only sizeable, naturally occurring corpus of process narratives, yet ambiguities and inconsistencies in its schema and annotations hinder reliable evaluation of its core task, Entity State Tracking (EST). In this paper, we introduce a Canonical Referent Tracking Schema (CRTS) that assigns every surface mention to a unique, immutable discourse referent and records that referent’s existence and location at each step. Applying CRTS to ProPara, we release the re-annotated result as ProPara-CRTS. The new corpus resolves ambiguous participant mentions in ProPara and consistently boosts performance across a variety of models. This suggests that principled schema design and targeted re-annotation can unlock measurable improvements in EST, providing a sharper diagnostic of model capacity in process narrative understanding without any changes to model architecture.
Anthology ID:
2025.iwcs-1.23
Volume:
Proceedings of the 16th International Conference on Computational Semantics
Month:
September
Year:
2025
Address:
Düsseldorf, Germany
Editors:
Kilian Evang, Laura Kallmeyer, Sylvain Pogodalla
Venues:
IWCS | WS
Publisher:
Association for Computational Linguistics
Pages:
264–278
URL:
https://preview.aclanthology.org/iwcs-25-ingestion/2025.iwcs-1.23/
Cite (ACL):
Bingyang Ye, Timothy Obiso, Jingxuan Tu, and James Pustejovsky. 2025. ProPara-CRTS: Canonical Referent Tracking for Reliable Evaluation of Entity State Tracking in Process Narratives. In Proceedings of the 16th International Conference on Computational Semantics, pages 264–278, Düsseldorf, Germany. Association for Computational Linguistics.
Cite (Informal):
ProPara-CRTS: Canonical Referent Tracking for Reliable Evaluation of Entity State Tracking in Process Narratives (Ye et al., IWCS 2025)
PDF:
https://preview.aclanthology.org/iwcs-25-ingestion/2025.iwcs-1.23.pdf