Tongnian Wang

2026

Telling Speculative Stories to Help Humans Imagine the Harms of Healthcare AI
Xingmeng Zhao | Tongnian Wang | Dan Schumacher | Veronica Rammouz | Anthony Rios
Findings of the Association for Computational Linguistics: ACL 2026

Artificial intelligence (AI) is rapidly transforming healthcare, enabling the fast development of tools such as stress monitors, wellness trackers, and mental health chatbots. However, this rapid and low-barrier development can also introduce risks, including bias, privacy violations, and unequal access, especially when systems overlook real-world contexts, diverse user needs, and cultural settings. Many recent approaches use AI to identify such risks automatically, but this can reduce human engagement in understanding how harms arise, who they affect, and which stakeholder needs remain unspoken. We present a human-centered ethical foresight framework that generates speculative user stories and supports multi-agent discussions to help people reflect on potential benefits and harms of healthcare AI before deployment. In a user study, participants who engaged with stories identified a broader range of harms, distributing their responses more evenly across all 17 harm types, whereas those who did not engage with stories focused primarily on privacy and well-being (79.1%). Overall, our findings suggest that storytelling helps people anticipate potential risks and benefits and reflect more broadly on how AI systems may affect different users, contexts, and often unspoken needs.

2025

Simplified Rewriting Improves Expert Summarization
Xingmeng Zhao | Tongnian Wang | Anthony Rios
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

Radiology report summarization (RRS) is critical for clinical workflows, requiring concise “Impressions” distilled from detailed “Findings.” This paper proposes a novel prompting strategy that enhances RRS by introducing a layperson summary as an intermediate step. This summary helps normalize key observations and simplify complex terminology using communication techniques inspired by doctor–patient interactions. Combined with few-shot in-context learning, this approach improves the model’s ability to map generalized descriptions to specific clinical findings. We evaluate our method on three benchmark datasets, MIMIC-CXR, CheXpert, and MIMIC-III, and compare it against state-of-the-art open-source language models in the 7B/8B parameter range, such as Llama-3.1-8B-Instruct. Results show consistent improvements in summarization quality, with gains of up to 5% on some metrics for prompting, and more than 20% for some models with instruction tuning.

2023

BabyStories: Can Reinforcement Learning Teach Baby Language Models to Write Better Stories?
Xingmeng Zhao | Tongnian Wang | Sheri Osborn | Anthony Rios
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning

UTSA-NLP at RadSum23: Multi-modal Retrieval-Based Chest X-Ray Report Summarization
Tongnian Wang | Xingmeng Zhao | Anthony Rios
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

Radiology report summarization aims to automatically provide concise summaries of radiology findings, reducing time and errors in manual summaries. However, current methods solely summarize the text, which overlooks critical details in the images. Unfortunately, directly using the images in a multimodal model is difficult. Multimodal models are susceptible to overfitting due to their increased capacity, and modalities tend to overfit and generalize at different rates. Thus, we propose a novel retrieval-based approach that uses image similarities to generate additional text features. We further employ few-shot with chain-of-thought and ensemble techniques to boost performance. Overall, our method achieves state-of-the-art performance in the F1RadGraph score, which measures the factual correctness of summaries. We rank second place in both MIMIC-CXR and MIMIC-III hidden tests among 11 teams.
