Xavier Coubez

2026

Medical Context Variation: A source of impairment for Event classification
Aman Sinha | Marianne Clausel | Mathieu Constant | Xavier Coubez
BioNLP 2026

Variation in writing style encapsulates nuanced characteristics, which are often exploited for author or demographic identification. In the medical domain, language models are frequently deployed to capture relevant information from unstructured or complex data, such as clinical notes that often include patients’ medical histories. Such data is largely free-form and unstructured, obtained through diverse clinician–patient interactions. In this work, we present a case study investigating whether variations in clinicians’ writing styles can lead to differences in the medical context understanding capabilities of pre-trained language models (PLMs) on downstream tasks, such as medical event classification. Our findings indicate that variation in writing style, characterized by linguistic features, can indeed lead to suboptimal performance in deployed systems. Furthermore, we explore linguistically guided counterfactual reasoning to mitigate the impact of writing style variation; the results suggest that LLM-based stylistic normalization is effective for this purpose.

2024

Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification?
Aman Sinha | Timothee Mickus | Marianne Clausel | Mathieu Constant | Xavier Coubez
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing

The success of pretrained language models (PLMs) across a spate of use-cases has led to significant investment from the NLP community towards building domain-specific foundational models. On the other hand, in mission critical settings such as biomedical applications, other aspects also factor in—chief of which is a model’s ability to produce reasonable estimates of its own uncertainty. In the present study, we discuss these two desiderata through the lens of how they shape the entropy of a model’s output probability distribution. We find that domain specificity and uncertainty awareness can often be successfully combined, but the exact task at hand weighs in much more strongly.

Venues

BioNLP (2)
WS (2)
