Gengyu Wang


Benchmarking Language-agnostic Intent Classification for Virtual Assistant Platforms
Gengyu Wang | Cheng Qian | Lin Pan | Haode Qi | Ladislav Kunc | Saloni Potdar
Proceedings of the Workshop on Multilingual Information Access (MIA)

Current virtual assistant (VA) platforms are beholden to the limited number of languages they support. Every component, such as the tokenizer and intent classifier, is engineered for specific languages in these intricate platforms. Thus, supporting a new language in such platforms is a resource-intensive operation requiring expensive re-training and re-designing. In this paper, we propose a benchmark for evaluating language-agnostic intent classification, the most critical component of VA platforms. To ensure the benchmarking is challenging and comprehensive, we include 29 public and internal datasets across 10 low-resource languages and evaluate various training and testing settings with consideration of both accuracy and training time. The benchmarking result shows that Watson Assistant, among 7 commercial VA platforms and pre-trained multilingual language models (LMs), demonstrates close-to-best accuracy with the best accuracy-training time trade-off.

Distinguish Sense from Nonsense: Out-of-Scope Detection for Virtual Assistants
Cheng Qian | Haode Qi | Gengyu Wang | Ladislav Kunc | Saloni Potdar
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

Out of Scope (OOS) detection in Conversational AI solutions enables a chatbot to handle a conversation gracefully when it is unable to make sense of the end-user query. Accurately tagging a query as out-of-domain is particularly hard in scenarios when the chatbot is not equipped to handle a topic which has semantic overlap with an existing topic it is trained on. We propose a simple yet effective OOS detection method that outperforms standard OOS detection methods in a real-world deployment of virtual assistants. We discuss the various design and deployment considerations for a cloud platform solution to train virtual assistants and deploy them at scale. Additionally, we propose a collection of datasets that replicates real-world scenarios and show comprehensive results in various settings using both offline and online evaluation metrics.


Semantic Categorization of Social Knowledge for Commonsense Question Answering
Gengyu Wang | Xiaochen Hou | Diyi Yang | Kathleen McKeown | Jing Huang
Proceedings of the Second Workshop on Simple and Efficient Natural Language Processing

Large pre-trained language models (PLMs) have led to great success on various commonsense question answering (QA) tasks in an end-to-end fashion. However, little attention has been paid to what commonsense knowledge is needed to deeply characterize these QA tasks. In this work, we proposed to categorize the semantics needed for these tasks using the SocialIQA as an example. Building upon our labeled social knowledge categories dataset on top of SocialIQA, we further train neural QA models to incorporate such social knowledge categories and relation information from a knowledge base. Unlike previous work, we observe our models with semantic categorizations of social knowledge can achieve comparable performance with a relatively simple model and smaller size compared to other complex approaches.


Soft Representation Learning for Sparse Transfer
Haeju Park | Jinyoung Yeo | Gengyu Wang | Seung-won Hwang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Transfer learning is effective for improving the performance of tasks that are related, and Multi-task learning (MTL) and Cross-lingual learning (CLL) are important instances. This paper argues that hard-parameter sharing, of hard-coding layers shared across different tasks or languages, cannot generalize well, when sharing with a loosely related task. Such case, which we call sparse transfer, might actually hurt performance, a phenomenon known as negative transfer. Our contribution is using adversarial training across tasks, to “soft-code” shared and private spaces, to avoid the shared space gets too sparse. In CLL, our proposed architecture considers another challenge of dealing with low-quality input.


Visual Choice of Plausible Alternatives: An Evaluation of Image-based Commonsense Causal Reasoning
Jinyoung Yeo | Gyeongbok Lee | Gengyu Wang | Seungtaek Choi | Hyunsouk Cho | Reinald Kim Amplayo | Seung-won Hwang
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)