Heyuan Huang

JHU

2026

MedScore: Generalizable Factuality Evaluation of Open-ended Long-form Medical Answers by Domain-adapted Claim Decomposition and Verification
Heyuan Huang | Alexandra DeLucia | Vijay Murari Tiyyala | Mark Dredze
Findings of the Association for Computational Linguistics: ACL 2026

While Large Language Models (LLMs) can generate fluent and convincing responses, they are not necessarily correct. This is especially apparent in the popular decompose-then-verify factuality evaluation pipeline, where LLMs evaluate generated text by decomposing it into individual, valid claims. Factuality evaluation is especially important for medical answers, since incorrect medical information could seriously harm the patient. However, existing factuality systems are a poor match for the medical domain, as they are typically only evaluated on objective, entity-centric, formulaic texts such as biographies and historical topics. This differs from condition-dependent, conversational, hypothetical, sentence-structure diverse, and subjective medical answers, making decomposition into valid facts challenging. We propose MedScore, a new pipeline to decompose medical answers into condition-aware valid facts and verify against in-domain corpora. Our method extracts up to three times as many valid facts as existing methods, reducing hallucination and vague references, and retaining condition-dependency in facts. We also find MedScore is generalizable to non-medical domains without any specific tuning. The resulting factuality score substantially varies by decomposition method, verification corpus, and used backbone LLM, highlighting the importance of customizing each step for reliable factuality evaluation by using our generalizable and modularized pipeline for domain adaptation.

2023

Medical Text Simplification: Optimizing for Readability with Unlikelihood Training and Reranked Beam Search Decoding
Lorenzo Jaime Flores | Heyuan Huang | Kejian Shi | Sophie Chheang | Arman Cohan
Findings of the Association for Computational Linguistics: EMNLP 2023

Text simplification has emerged as an increasingly useful application of AI for bridging the communication gap in specialized fields such as medicine, where the lexicon is often dominated by technical jargon and complex constructs. Despite notable progress, methods in medical simplification sometimes result in the generated text having lower quality and diversity. In this work, we explore ways to further improve the readability of text simplification in the medical domain. We propose (1) a new unlikelihood loss that encourages generation of simpler terms and (2) a reranked beam search decoding method that optimizes for simplicity, which achieve better performance on readability metrics on three datasets. This study’s findings offer promising avenues for improving text simplification in the medical field.

Co-authors

Kejian Shi 1

Vijay Murari Tiyyala 1

Venues

Findings 2
