Yuhao Chen
Papers on this page may belong to the following people: Yuhao Chen, Yuhao Chen
2026
Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework
Clea Chataigner | Rebecca Ma | Prakhar Ganesh | Yuhao Chen | Afaf Taik | Elliot Creager | Golnoosh Farnadi
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Clea Chataigner | Rebecca Ma | Prakhar Ganesh | Yuhao Chen | Afaf Taik | Elliot Creager | Golnoosh Farnadi
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) are highly sensitive to subtle changes in prompt phrasing, posing challenges for reliable auditing. Prior methods often apply unconstrained prompt paraphrasing, which risk missing linguistic and demographic factors that shape authentic user interactions. We introduce AUGMENT (Automated User-Grounded Modeling and Evaluation of Natural Language Transformations), a framework for generating controlled paraphrases, grounded in user behaviors. AUGMENT leverages linguistically informed rules and enforces quality through checks on instruction adherence, semantic similarity, and realism, ensuring paraphrases are both reliable and meaningful for auditing. Through case studies on the BBQ and MMLU datasets, we show that controlled paraphrases uncover systematic weaknesses that remain obscured under unconstrained variation. These results highlight the value of the AUGMENT framework for reliable auditing.
2025
Streamlining the Collaborative Chain of Models into A Single Forward Pass in Generation-Based Tasks
Yuanjie Lyu | Chao Zhang | Yuhao Chen | Yong Chen | Tong Xu
Findings of the Association for Computational Linguistics: ACL 2025
Yuanjie Lyu | Chao Zhang | Yuhao Chen | Yong Chen | Tong Xu
Findings of the Association for Computational Linguistics: ACL 2025
In Retrieval-Augmented Generation (RAG) and agent-based frameworks, the “Chain of Models” approach is widely used, where multiple specialized models work sequentially on distinct sub-tasks. This approach is effective but increases resource demands as each model must be deployed separately. Recent advancements attempt to address this by applying prompt tuning, which allows a shared base model to adapt to multiple tasks with minimal parameter changes. However, a key challenge remains: intermediate outputs, passed between models as plain text, require recomputation of hidden states (i.e., Key and Value (KV) states in Transformers) during inference. In this paper, we introduce FTHSS, a novel prompt-tuning method that enables models to share KV hidden states, eliminating redundant forward passes and reducing KV cache storage. By modifying input and attention masks during training, FTHSS allows models to effectively utilize KV hidden states from prior models in both single- and multi-round scenarios. Empirical results on four tasks show that FTHSS matches the performance of traditional model chains while improving inference efficiency.