@inproceedings{banerjee-etal-2025-hits,
title = "{HITS} at {DISRPT} 2025: Discourse Segmentation, Connective Detection, and Relation Classification",
author = "Banerjee, Souvik and
Fan, Yi and
Strube, Michael",
editor = "Braud, Chlo{\'e} and
Liu, Yang Janet and
Muller, Philippe and
Zeldes, Amir and
Li, Chuyuan",
booktitle = "Proceedings of the 4th Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2025)",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingest-emnlp/2025.disrpt-1.5/",
pages = "63--78",
ISBN = "979-8-89176-344-9",
    abstract = "This paper describes the submission of the HITS team to the DISRPT 2025 shared task, which comprises three sub-tasks: (1) discourse unit segmentation across formalisms, (2) cross-lingual discourse connective identification, and (3) cross-formalism discourse relation classification. For Task 1, our approach involves fine-tuning through multilingual joint training on linguistically motivated language groups. We incorporate two key techniques to improve model performance: a weighted loss function to address the task{'}s significant class imbalance, and Fast Gradient Method (FGM) adversarial training to boost the model{'}s robustness. For Task 2, we build an ensemble of three encoder models whose embeddings are fused with a multi-head attention layer. We also add Part-Of-Speech tags and dependency relations from the training files as linguistic features, and place a CRF layer after the classification layer to model dependencies between adjacent labels. To counter label imbalance, we use focal loss and label smoothing, making our model robust and flexible enough to handle different languages. For Task 3, we use a two-stage fine-tuning framework designed to transfer the nuanced reasoning capabilities of a very large ``teacher'' model to a compact ``student'' model so that the smaller model can learn complex discourse relationships. Fine-tuning follows a curriculum learning framework, in which the model learns to perform increasingly harder tasks: it first learns to predict the label from the discourse units alone, and then studies Chain-Of-Thought reasoning for harder examples. In this way it learns to internalise such reasoning and improves prediction accuracy on the harder samples."
}