Miguel Del-Agua


2025

Empowering Healthcare Practitioners with Language Models: Structuring Speech Transcripts in Two Real-World Clinical Applications
Jean-Philippe Corbeil | Asma Ben Abacha | George Michalopoulos | Phillip Swazinna | Miguel Del-Agua | Jerome Tremblay | Akila Jeeson Daniel | Cari Bader | Kevin Cho | Pooja Krishnan | Nathan Bodenstab | Thomas Lin | Wenxuan Teng | Francois Beaulieu | Paul Vozila
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track

Large language models (LLMs) such as GPT-4o and o1 have demonstrated strong performance on clinical natural language processing (NLP) tasks across multiple medical benchmarks. Nonetheless, two high-impact NLP tasks — structured tabular reporting from nurse dictations and medical order extraction from doctor-patient consultations — remain underexplored due to data scarcity and sensitivity, despite active industry efforts. Practical solutions to these real-world clinical tasks can significantly reduce the documentation burden on healthcare providers, allowing greater focus on patient care. In this paper, we investigate these two challenging tasks using private and open-source clinical datasets, evaluating the performance of both open- and closed-weight LLMs, and analyzing their respective strengths and limitations. Furthermore, we propose an agentic pipeline for generating realistic, non-sensitive nurse dictations, enabling structured extraction of clinical observations. To support further research in both areas, we release SYNUR and SIMORD, the first open-source datasets for nurse observation extraction and medical order extraction.

2023

Large Scale Sequence-to-Sequence Models for Clinical Note Generation from Patient-Doctor Conversations
Gagandeep Singh | Yue Pan | Jesus Andres-Ferrer | Miguel Del-Agua | Frank Diehl | Joel Pinto | Paul Vozila
Proceedings of the 5th Clinical Natural Language Processing Workshop

We present our work on building large-scale sequence-to-sequence models for generating clinical notes from patient-doctor conversations. This is formulated as an abstractive summarization task, for which we use an encoder-decoder transformer model with a pointer-generator. We discuss various modeling enhancements to this baseline model, which include using a subword and multiword tokenization scheme, prefixing the targets with a chain-of-clinical-facts, and training with a contrastive loss defined over various candidate summaries. We also use flash attention during training and query chunked attention during inference to process long input and output sequences and to improve computational efficiency. Experiments are conducted on a dataset containing about 900K encounters from around 1800 healthcare providers covering 27 specialties. The results are broken down into primary care and non-primary care specialties. Consistent accuracy improvements are observed across both of these categories.