Taiki Papandreou


2025

Jargon identification is critical for improving the accessibility of biomedical texts, yet models are often evaluated on isolated datasets, leaving open questions about generalization. After reproducing MedReadMe's jargon detection results and extending the evaluation to the PLABA dataset, we find that transfer learning across datasets yields only modest gains, largely due to divergent annotation objectives. Through manual re-annotation, we show that aligning labeling schemes improves cross-dataset performance. Building on these findings, we evaluate several jargon-aware prompting strategies for LLM-based medical text simplification. Explicitly highlighting jargon in prompts does not consistently improve simplification quality; when gains do occur, they often trade off against readability and are model-dependent. Human evaluation indicates that simple prompting can be as effective as more complex jargon-aware instructions. We release code to facilitate further research: https://anonymous.4open.science/r/tsar-anonymous-2D66F/README.md