@inproceedings{hanken-2026-agentic,
title = "Agentic {AI} Architectures for {SOAP} Note Generation",
author = "Hanken, Keno",
editor = "Demner-Fushman, Dina and
Ananiadou, Sophia and
Roberts, Kirk and
Tsujii, Junichi",
booktitle = "{B}io{NLP} 2026",
month = jul,
year = "2026",
address = "San Diego, California",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingest-acl-workshops/2026.bionlp-1.61/",
pages = "742--752",
ISBN = "979-8-89176-434-7",
abstract = "Clinical documentation places significant time demands on medical professionals, consumes institutional resources, and is prone to errors that may compromise patient care. Recent advances in LLMs offer promising approaches for automating clinical note generation; however, the impact of different AI architectural designs remains underexplored, particularly for agentic AI systems. This study compares three architectures (single-LLM, multi-agentic, and swarm-agentic) for automated SOAP (Subjective, Objective, Assessment, Plan) note generation from doctor--patient dialogues. All approaches employ QLoRA-finetuned Ministral 3 models (3B and 8B parameters) trained on the MedSynth dataset, comprising 10,030 dialogue--note pairs across 2,006 ICD-10 code classes. Performance is evaluated using ROUGE-1, ROUGE-2, ROUGE-L, and BERTScore against a lexical-overlap baseline (dialogue vs. ground-truth SOAP, no inference). Results show that all finetuned models substantially outperform the baseline, while differences between architectural variants remain marginal. The single-LLM setup achieves the strongest performance across all metrics; 3B and 8B variants perform nearly identically on semantic similarity (BERTScore), while ROUGE differences are small but statistically significant. Qualitative inspection further reveals that residual differences across architectures are driven primarily by shared dataset priors rather than by architectural reasoning capacity. The results are based on synthetic data without human evaluation and reflect architectural behavior only."
}