Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025)
Chloé Braud | Yang Janet Liu | Philippe Muller | Amir Zeldes | Chuyuan Li
The DISRPT 2025 Shared Task on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification
Chloé Braud | Amir Zeldes | Chuyuan Li | Yang Janet Liu | Philippe Muller
In 2025, we held the fourth iteration of the DISRPT Shared Task (Discourse Relation Parsing and Treebanking) dedicated to discourse parsing across formalisms. Following the success of the 2019, 2021, and 2023 tasks on Elementary Discourse Unit Segmentation, Connective Detection, and Relation Classification, this iteration added 13 new datasets, including three new languages (Czech, Polish, Nigerian Pidgin) and two new frameworks: the ISO framework and Enhanced Rhetorical Structure Theory, in addition to the previously included frameworks: RST, SDRT, DEP, and PDTB. In this paper, we review the data included in DISRPT 2025, which covers 39 datasets across 16 languages, survey and compare submitted systems, and report on system performance on each task for both treebanked and plain-tokenized versions of the data. The best systems obtain a mean accuracy of 71.19% for relation classification, a mean F1 of 91.57 (Treebanked Track) and 87.38 (Plain Track) for segmentation, and a mean F1 of 81.53 (Treebanked Track) and 79.92 (Plain Track) for connective identification. The data and trained models of several participants can be found at https://huggingface.co/multilingual-discourse-hub.
DisCuT and DiscReT: MELODI at DISRPT 2025 Multilingual discourse segmentation, connective tagging and relation classification
Robin Pujol | Firmin Rousseau | Philippe Muller | Chloé Braud
This paper presents the results obtained by the MELODI team for the three tasks proposed within the DISRPT 2025 shared task on discourse: segmentation, connective identification, and relation classification. The competition involves corpora in various languages and several underlying frameworks, and datasets are provided with or without sentence segmentation. This year, the ranked, closed track added the constraint of training only one model per task, with an upper bound on model size (no more than 4B parameters). An additional open track allows models of any size, possibly non-public, which are not reproduced by the organizers and are therefore not ranked. We compared several fine-tuning approaches based either on encoder-only transformer models or on auto-regressive generative ones. To train one model on the variety of corpora, we explored various ways of combining data (by framework, language, or language group, with different sequential orderings) as well as the addition of features to guide the model. For the closed track, our final submitted system is based on XLM-RoBERTa large for relation identification, and on InfoXLM for segmentation and connective identification. Our experiments demonstrate that building a single, multilingual model does not necessarily degrade performance compared to language-specific systems, with at best 64.06% for relation identification, 90.19% for segmentation, and 81.15% for connective identification (on average over the development sets), results that are similar to or higher than those obtained in previous campaigns. We also found that a generative approach could give even higher results on relation identification, with at best 64.65% on the dev sets.
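The closed-track relation system described above is a single multilingual classifier fine-tuned on the pooled corpora. As a rough illustration (not the authors' code), a minimal pair-classification setup with XLM-RoBERTa might look as follows; the file name and the three-label list are placeholders for the pooled DISRPT data and its 17 unified labels.

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

# Placeholder subset of the 17 unified DISRPT relation labels.
LABELS = ["causal", "contrast", "elaboration"]

tok = AutoTokenizer.from_pretrained("xlm-roberta-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-large", num_labels=len(LABELS))

# Hypothetical pooled file with one row per relation instance: unit1, unit2, label.
ds = load_dataset("csv", data_files={"train": "pooled_relations_train.csv"})

def encode(batch):
    # Encode the two discourse units as a sentence pair.
    enc = tok(batch["unit1"], batch["unit2"], truncation=True, max_length=256)
    enc["labels"] = [LABELS.index(lab) for lab in batch["label"]]
    return enc

ds = ds.map(encode, batched=True, remove_columns=ds["train"].column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments("rel-sketch", learning_rate=1e-5,
                           per_device_train_batch_size=16, num_train_epochs=3),
    train_dataset=ds["train"],
    data_collator=DataCollatorWithPadding(tok),
)
trainer.train()
```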
CLaC at DISRPT 2025: Hierarchical Adapters for Cross-Framework Multi-lingual Discourse Relation Classification
Nawar Turk | Daniele Comitogianni | Leila Kosseim
We present our submission to Task 3 (Discourse Relation Classification) of the DISRPT 2025 shared task. Task 3 introduces a unified set of 17 discourse relation labels across 39 corpora in 16 languages and six discourse frameworks, posing significant multilingual and cross-formalism challenges. We first benchmark the task by fine-tuning multilingual BERT-based models (mBERT, XLM-RoBERTa-Base, and XLM-RoBERTa-Large) with two argument-ordering strategies and progressive unfreezing ratios to establish strong baselines. We then evaluate prompt-based large language models (namely Claude Opus 4.0) in zero-shot and few-shot settings to understand how LLMs respond to the newly proposed unified labels. Finally, we introduce HiDAC, a Hierarchical Dual-Adapter Contrastive learning model. Results show that while larger transformer models achieve higher accuracy, the improvements are modest, and that unfreezing the top 75% of encoder layers yields performance comparable to full fine-tuning while training far fewer parameters. Prompt-based models lag significantly behind fine-tuned transformers, and HiDAC achieves the highest overall accuracy (67.5%) while remaining more parameter-efficient than full fine-tuning.
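The partial-unfreezing baseline mentioned in the abstract can be illustrated with a small sketch (assuming a Hugging Face XLM-RoBERTa sequence-classification model; this is not the CLaC code): freeze the whole encoder, then re-enable gradients only for the top fraction of transformer layers and the classification head.

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-large", num_labels=17)  # 17 unified DISRPT relation labels

def unfreeze_top_layers(model, fraction=0.75):
    layers = model.roberta.encoder.layer          # list of transformer blocks
    cutoff = int(len(layers) * (1 - fraction))    # blocks below cutoff stay frozen
    for p in model.roberta.parameters():
        p.requires_grad = False                   # start with a fully frozen encoder
    for layer in layers[cutoff:]:
        for p in layer.parameters():
            p.requires_grad = True                # unfreeze the top blocks only
    for p in model.classifier.parameters():
        p.requires_grad = True                    # classification head is always trained

unfreeze_top_layers(model, fraction=0.75)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable:,}")
```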
DeDisCo at the DISRPT 2025 Shared Task: A System for Discourse Relation Classification
Zhuoxuan Ju | Jingni Wu | Abhishek Purushothama | Amir Zeldes
This paper presents DeDisCo, Georgetown University’s entry in the DISRPT 2025 shared task on discourse relation classification. We test two approaches: an mT5-based encoder approach and a decoder-based approach using the openly available Qwen model. We also experiment with training on an augmented dataset for low-resource languages, using matched data translated automatically from English, as well as with some additional linguistic features inspired by entries in previous editions of the shared task. Our system achieves a macro-accuracy score of 71.28, and we provide some interpretation and error analysis for our results.
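To make the decoder-based setup concrete, a hypothetical prompt template in the spirit described above could render the two discourse units plus a few surface features into text and ask the model for a single label; the feature set, wording, and label list here are illustrative, not the actual DeDisCo template.

```python
def build_prompt(unit1, unit2, direction, genre, labels):
    # Render discourse units and simple linguistic features into a classification prompt.
    features = f"direction={direction}; genre={genre}"
    label_list = ", ".join(labels)
    return (
        "Classify the discourse relation between the two units.\n"
        f"Unit 1: {unit1}\n"
        f"Unit 2: {unit2}\n"
        f"Features: {features}\n"
        f"Answer with exactly one label from: {label_list}\nLabel:"
    )

prompt = build_prompt(
    "The road was icy", "so the bus was late",
    direction="1>2", genre="news",
    labels=["causal", "contrast", "elaboration"],  # placeholder subset of the 17 labels
)
print(prompt)
```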
HITS at DISRPT 2025: Discourse Segmentation, Connective Detection, and Relation Classification
Souvik Banerjee | Yi Fan | Michael Strube
This paper describes the submission of the HITS team to the DISRPT 2025 shared task, which includes three sub-tasks: (1) discourse unit segmentation across formalisms, (2) cross-lingual discourse connective identification, and (3) cross-formalism discourse relation classification. In Task 1, our approach involves fine-tuning through multilingual joint training on linguistically motivated language groups. We incorporate two key techniques to improve model performance: a weighted loss function to address the task’s significant class imbalance, and Fast Gradient Method (FGM) adversarial training to boost the model’s robustness. In Task 2, our approach builds an ensemble of three encoder models whose embeddings are fused with a multi-head attention layer. We also add Part-of-Speech tags and dependency relations provided in the training files as linguistic features, and a CRF layer after the classification layer to account for dependencies between adjacent labels. To address label imbalance, we use focal loss and label smoothing, which keeps the model robust and flexible enough to handle different languages. In Task 3, we use a two-stage fine-tuning framework designed to transfer the nuanced reasoning capabilities of a very large “teacher” model to a compact “student” model, so that the smaller model can learn complex discourse relationships. The fine-tuning process follows a curriculum in which the model performs increasingly harder tasks: it first learns to predict the label from the discourse units alone, and then from Chain-of-Thought reasoning for harder examples, so that it internalises this reasoning and increases prediction accuracy on the harder samples.
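The FGM adversarial training mentioned for Task 1 is a standard technique; a generic PyTorch sketch (assuming a model whose word-embedding parameters have "embed" in their names; this is not the HITS implementation) perturbs the embedding weights along the gradient, runs a second forward/backward pass, and then restores the original weights.

```python
import torch

class FGM:
    def __init__(self, model, epsilon=1.0, pattern="embed"):
        self.model, self.epsilon, self.pattern = model, epsilon, pattern
        self.backup = {}

    def attack(self):
        # Add an L2-normalized gradient step to the embedding weights.
        for name, p in self.model.named_parameters():
            if p.requires_grad and self.pattern in name and p.grad is not None:
                self.backup[name] = p.data.clone()
                norm = torch.norm(p.grad)
                if norm != 0:
                    p.data.add_(self.epsilon * p.grad / norm)

    def restore(self):
        # Put the original embedding weights back after the adversarial pass.
        for name, p in self.model.named_parameters():
            if name in self.backup:
                p.data = self.backup[name]
        self.backup = {}

# Typical training step:
#   loss = model(**batch).loss; loss.backward()
#   fgm.attack(); model(**batch).loss.backward(); fgm.restore()
#   optimizer.step(); optimizer.zero_grad()
```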
SeCoRel: Multilingual Discourse Analysis in DISRPT 2025
Sobha Lalitha Devi | Pattabhi Rk Rao | Vijay Sundar Ram
This work describes our participation in all three tasks of the DISRPT 2025 shared task: Task 1, Discourse Unit Segmentation across Formalisms; Task 2, Discourse Connective Identification across Languages; and Task 3, Discourse Relation Classification across Formalisms. We fine-tuned XLM-RoBERTa, a multilingual language model, to address the three tasks, producing a single multilingual model per task. Our system handles data in both the .conllu and .tok formats and across the different discourse formalisms. We obtained encouraging results: performance on the test data for the three tasks is similar to that obtained on the development data.
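As a rough sketch of what a single multilingual XLM-RoBERTa model for Task 1 can look like when segmentation is framed as begin-of-unit token tagging (an assumption for illustration, not the SeCoRel code):

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

LABELS = ["O", "B-SEG"]  # mark tokens that begin an elementary discourse unit
tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=len(LABELS))

# One toy sentence with a segment boundary before "because" (labels are per word).
words = ["She", "left", "early", "because", "it", "rained"]
word_labels = [1, 0, 0, 1, 0, 0]

enc = tok(words, is_split_into_words=True, return_tensors="pt")
# Align word-level labels to subword tokens; special tokens get -100 and are ignored.
aligned = [-100 if i is None else word_labels[i] for i in enc.word_ids(0)]
out = model(**enc, labels=torch.tensor([aligned]))
print(out.loss)
```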