Qian Hu


2023

pdf
Resolving Ambiguities in Text-to-Image Generative Models
Ninareh Mehrabi | Palash Goyal | Apurv Verma | Jwala Dhamala | Varun Kumar | Qian Hu | Kai-Wei Chang | Richard Zemel | Aram Galstyan | Rahul Gupta
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Natural language often contains ambiguities that can lead to misinterpretation and miscommunication. While humans can handle ambiguities effectively by asking clarifying questions and/or relying on contextual cues and common-sense knowledge, resolving ambiguities can be notoriously hard for machines. In this work, we study ambiguities that arise in text-to-image generative models. We curate the Text-to-image Ambiguity Benchmark (TAB) dataset to study different types of ambiguities in text-to-image generative models. We then propose the Text-to-ImagE Disambiguation (TIED) framework to disambiguate the prompts given to the text-to-image generative models by soliciting clarifications from the end user. Through automatic and human evaluations, we show the effectiveness of our framework in generating more faithful images aligned with end user intention in the presence of ambiguities.

2021

pdf
Collaborative Data Relabeling for Robust and Diverse Voice Apps Recommendation in Intelligent Personal Assistants
Qian Hu | Thahir Mohamed | Wei Xiao | Zheng Gao | Xibin Gao | Radhika Arava | Xiyao Ma | Mohamed AbdelHady
Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI

Intelligent personal assistants (IPAs) such as Amazon Alexa, Google Assistant and Apple Siri extend their built-in capabilities by supporting voice apps developed by third-party developers. Sometimes the smart assistant is not able to successfully respond to user voice commands (aka utterances). There are many reasons including automatic speech recognition (ASR) error, natural language understanding (NLU) error, routing utterances to an irrelevant voice app or simply that the user is asking for a capability that is not supported yet. The failure to handle a voice command leads to customer frustration. In this paper, we introduce a fallback skill recommendation system to suggest a voice app to a customer for an unhandled voice command. One of the prominent challenges of developing a skill recommender system for IPAs is partial observation. To solve the partial observation problem, we propose collaborative data relabeling (CDR) method. In addition, CDR also improves the diversity of the recommended skills. We evaluate the proposed method both offline and online. The offline evaluation results show that the proposed system outperforms the baselines. The online A/B testing results show significant gain of customer experience metrics.

2004

pdf
Audio Hot Spotting and Retrieval using Multiple Features
Qian Hu | Fred Goodman | Stanley Boykin | Randy Fish | Warren Greiff
Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL 2004