Martyna Lewandowska

2026

This paper presents Fables-DTR, a corpus of Aesop’s fables annotated for discourse and temporal relations, designed to explore how event sequencing, aspectual features, and discourse relations interact. Building on the ISO 24617 Semantic Annotation Framework and integrating Part 1 (Time and Events) with Part 8 (Discourse Relations), the resource provides a unified representation of discourse structure and of temporal and aspectual features. The corpus comprises 15 fables in English, automatically translated into European Portuguese and Polish (45 texts in total), with all translations manually validated by native linguists to preserve semantic and discourse features. Each fable is annotated in two layers: (i) discourse relations, argument roles, and signals; and (ii) temporal relations and event attributes such as Tense, Aspect, and Polarity. The resulting dataset documents how discourse relations associate with temporal and aspectual features. Fables-DTR contributes both a valuable resource for cross-linguistic and narrative discourse analysis and empirical evidence for integrating ISO standards in multilayer annotation. It also provides a foundation for computational applications in discourse parsing, event ordering, and implicit relation detection.

2025

This study addresses the fundamental task of discourse unit detection – the critical initial step in discourse parsing. We analyze how various discourse frameworks conceptualize and structure discourse units, with a focus on their underlying taxonomies and theoretical assumptions. While approaches to discourse segmentation vary considerably, the extent to which these conceptual divergences influence practical implementations remains insufficiently studied. To address this gap, we investigate similarities and differences in segmentation across several English datasets, segmented and annotated according to distinct discourse frameworks, using simple rule-based heuristics. We evaluate the effectiveness of the rules against gold-standard segmentation, while also checking variability and cross-framework generalizability. Additionally, we manually compare a sample of rule-based segmentation outputs against benchmark segmentation, identifying points of convergence and divergence. Our findings indicate that discourse frameworks align strongly at the level of segmentation: particular clauses consistently serve as the primary boundaries of discourse units. Discrepancies arise mainly in the treatment of other structures, such as adpositional phrases, appositions, interjections, and parenthesised text segments, which are inconsistently marked as separate discourse units across formalisms.
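
To make the evaluation setup above concrete, the following is a minimal sketch of how a rule-based segmenter might be scored against gold-standard boundaries. It is an illustration only, not the rule set or metric from the study: the cue-word list `CLAUSE_CUES`, the function names, and the choice of boundary-level precision, recall, and F1 are all assumptions made for this example.

```python
from typing import List, Set, Tuple

# Hypothetical cue words for this sketch; NOT the rules used in the study.
CLAUSE_CUES = {"because", "although", "when", "while", "but"}
SENTENCE_END = {".", "!", "?"}


def rule_based_boundaries(tokens: List[str]) -> Set[int]:
    """Return every index i > 0 at which a new discourse unit is assumed to start.

    A unit starts at tokens[i] if the token is a cue word or if the
    previous token is sentence-final punctuation.
    """
    boundaries = set()
    for i in range(1, len(tokens)):
        if tokens[i].lower() in CLAUSE_CUES or tokens[i - 1] in SENTENCE_END:
            boundaries.add(i)
    return boundaries


def boundary_prf(gold: Set[int], pred: Set[int]) -> Tuple[float, float, float]:
    """Boundary-level precision, recall, and F1 of predicted unit starts."""
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    tokens = "The fox left because the grapes were sour , he said .".split()
    predicted = rule_based_boundaries(tokens)
    # Invented gold annotation that also splits before the attribution "he said".
    gold = {3, 9}
    print(predicted, boundary_prf(gold, predicted))
```

Running the same rules over datasets annotated under different frameworks and comparing the resulting scores is one way to probe cross-framework generalizability; in this toy case the rules miss the attribution boundary, so recall falls below precision.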