Team Aurum at MedExACT 2026@ACL: Data Augmentation and Clinical Longformer Fine-Tuning for Medical Decision Extraction

Jyoti Kumari; Vinay Ulli; Anindita Mondal

Abstract

This paper describes the system submitted by Team Aurum to the Medical Decision Extraction, Analysis, and Classification Task (MedExACT) at BioNLP 2026. The task requires extracting and classifying contiguous text spans that represent medical decisions in lengthy ICU discharge summaries. To address the dual challenges of long documents and severe class imbalance within a limited training set of 350 notes, we propose a two-pronged strategy. First, we employ a three-part data augmentation pipeline that combines rule-based entity replacement, LLM-based contextual paraphrasing, and synthetic note generation to expand the training data to over 2,300 notes. Second, we fine-tune a domain-specific Clinical Longformer model with a sliding-window inference mechanism and Focal Loss, which lets it handle sequences of up to 2,048 tokens while focusing on rare decision categories. Paired with a targeted post-processing module, our system achieved a Final Score of 0.5251, demonstrating high token-level detection (Token F1: 0.6311) and strong stability across patient demographics.
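
As an illustration of the two modeling ideas named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. The 2,048-token window comes from the abstract; the stride of 1,024, the focusing parameter gamma = 2.0, and the function names `sliding_windows` and `focal_loss` are assumptions made for this example only.

```python
import math


def sliding_windows(n_tokens, window=2048, stride=1024):
    """Return (start, end) token offsets of overlapping windows.

    window=2048 matches the sequence length stated in the abstract;
    stride=1024 is an assumed value, not taken from the paper.
    """
    if n_tokens <= 0:
        return []
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    spans = []
    start = 0
    while True:
        end = min(start + window, n_tokens)
        spans.append((start, end))
        if end == n_tokens:  # the last window reaches the end of the note
            break
        start += stride
    return spans


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one token: -(1 - p_t)^gamma * log(p_t).

    probs is a list of class probabilities and target is the gold class
    index. gamma=2.0 is an assumed default; gamma=0 gives cross-entropy.
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

For a 5,000-token note, `sliding_windows(5000)` yields four overlapping windows, so each token in an overlap region is scored more than once and the predictions have to be merged afterwards (for example by averaging). In `focal_loss`, the factor `(1 - p_t) ** gamma` shrinks the loss on tokens the model already classifies confidently, which leaves relatively more gradient signal for rare decision categories.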

Anthology ID: 2026.bionlp-2.29
Volume: Proceedings of the BioNLP 2026 (Shared Tasks)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Deepak Gupta, Dina Demner-Fushman
Venues: BioNLP | WS
Publisher: Association for Computational Linguistics
Pages: 224–228
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-2.29/
Cite (ACL): Jyoti Kumari, Vinay Ulli, and Anindita Mondal. 2026. Team Aurum at MedExACT 2026@ACL: Data Augmentation and Clinical Longformer Fine-Tuning for Medical Decision Extraction. In Proceedings of the BioNLP 2026 (Shared Tasks), pages 224–228, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Team Aurum at MedExACT 2026@ACL: Data Augmentation and Clinical Longformer Fine-Tuning for Medical Decision Extraction (Kumari et al., BioNLP 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-2.29.pdf
Supplementary material: 2026.bionlp-2.29.SupplementaryMaterial.txt