Min-Hsuan Ku

2026

LLATMU at #SMM4H-HeaRD 2026: Clinical Text Structuring with QLoRA-based Generation and Partial-Label TNM Classification
Eric Hsiao | Min-Hsuan Ku | Hsuan-Lei Shao
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks

We describe the LLATMU systems submitted to the #SMM4H-HeaRD 2026 shared tasks, covering two related clinical text structuring problems: dialogue-to-SOAP note generation (Task 4) and TNM staging classification from pathology reports (Task 6). Although the two tasks differ in modeling paradigm (text generation versus supervised classification), both require transforming unstructured clinical narratives into structured representations. For Task 4, we instruction-tuned LLMs with parameter-efficient adaptation and submitted a QLoRA-based Ministral-3B system, achieving an official blind test average score of 0.53 and outperforming the task-wide mean and median. For Task 6, we formulate TNM prediction as a three-head classification problem using BioClinical-ModernBERT-large with long-context encoding, class-weighted loss, and normalized partial-label training. The model achieves a validation average macro-F1 of 0.9196 and continues to outperform the official baseline on the more challenging tie-break test set. Across both tasks, our results suggest that robust data handling, stable fine-tuning, and task-appropriate supervision are important for practical clinical NLP under constrained and imperfect shared-task settings.
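
The abstract does not define the Task 6 loss, so the following is only an illustrative sketch of one way a three-head, class-weighted, partial-label objective could be written. It assumes that "normalized partial-label training" means averaging the per-head losses over whichever of the T, N, and M labels are present for a report; that reading, the function names, and the toy class counts are assumptions and are not taken from the paper.

```python
import math

HEADS = ("T", "N", "M")  # one classification head per TNM component


def weighted_nll(logits, target, class_weights):
    """Class-weighted negative log-likelihood for a single head."""
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return class_weights[target] * (log_z - logits[target])


def partial_label_loss(head_logits, labels, class_weights):
    """Average the per-head losses over the heads that carry a label.

    labels[h] is None when that component is not annotated (assumed
    convention). Dividing by the number of labeled heads keeps the loss
    scale comparable between fully and partially labeled reports.
    """
    losses = [
        weighted_nll(head_logits[h], labels[h], class_weights[h])
        for h in HEADS
        if labels[h] is not None
    ]
    if not losses:
        return 0.0  # nothing to supervise for this report
    return sum(losses) / len(losses)
```

For a report whose N label is missing, `partial_label_loss(logits, {"T": 0, "N": None, "M": 1}, weights)` averages over the T and M heads only, so the missing head contributes neither loss nor gradient.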

Co-authors

Eric Hsiao 1
Hsuan-Lei Shao 1

Venues

SMM4H 1
WS 1
