2025
Simplified Rewriting Improves Expert Summarization
Xingmeng Zhao | Tongnian Wang | Anthony Rios
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Radiology report summarization (RRS) is critical for clinical workflows, requiring concise “Impressions” distilled from detailed “Findings.” This paper proposes a novel prompting strategy that enhances RRS by introducing a layperson summary as an intermediate step. This summary helps normalize key observations and simplify complex terminology using communication techniques inspired by doctor–patient interactions. Combined with few-shot in-context learning, this approach improves the model’s ability to map generalized descriptions to specific clinical findings. We evaluate our method on three benchmark datasets, MIMIC-CXR, CheXpert, and MIMIC-III, and compare it against state-of-the-art open-source language models in the 7B/8B parameter range, such as Llama-3.1-8B-Instruct. Results show consistent improvements in summarization quality, with gains of up to 5% on some metrics with prompting and more than 20% for some models with instruction tuning.
2023
UTSA-NLP at RadSum23: Multi-modal Retrieval-Based Chest X-Ray Report Summarization
Tongnian Wang | Xingmeng Zhao | Anthony Rios
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Radiology report summarization aims to automatically provide concise summaries of radiology findings, reducing the time and errors involved in writing them manually. However, current methods summarize only the text, overlooking critical details in the images. Unfortunately, directly using the images in a multimodal model is difficult: multimodal models are susceptible to overfitting due to their increased capacity, and the modalities tend to overfit and generalize at different rates. Thus, we propose a novel retrieval-based approach that uses image similarities to generate additional text features. We further employ few-shot prompting with chain-of-thought and ensemble techniques to boost performance. Overall, our method achieves state-of-the-art performance on the F1RadGraph score, which measures the factual correctness of summaries. We ranked second among 11 teams on both the MIMIC-CXR and MIMIC-III hidden test sets.
BabyStories: Can Reinforcement Learning Teach Baby Language Models to Write Better Stories?
Xingmeng Zhao | Tongnian Wang | Sheri Osborn | Anthony Rios
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning