Ziyan Jiang


2022

pdf
PENTATRON: PErsonalized coNText-Aware Transformer for Retrieval-based cOnversational uNderstanding
Niranjan Uma Naresh | Ziyan Jiang | Ankit Ankit | Sungjin Lee | Jie Hao | Xing Fan | Chenlei Guo
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

Conversational understanding is an integral part of modern intelligent devices. In a large fraction of the global traffic from customers using smart digital assistants, frictions in dialogues may be attributed to incorrect understanding of the entities in a customer’s query due to factors including ambiguous mentions, mispronunciation, background noise and faulty on-device signal processing. Such errors are compounded by two common deficiencies from intelligent devices namely, (1) the device not being tailored to individual customers, and (2) the device responses being unaware of the context in the conversation session. Viewing this problem via the lens of retrieval-based search engines, we build and evaluate a scalable entity correction system, PENTATRON. The system leverages a parametric transformer-based language model to learn patterns from in-session customer-device interactions coupled with a non-parametric personalized entity index to compute the correct query, which aids downstream components in reasoning about the best response. In addition to establishing baselines and demonstrating the value of personalized and context-aware systems, we use multitasking to learn the domain of the correct entity. We also investigate the utility of language model prompts. Through extensive experiments, we show a significant upward movement of the key metric (Exact Match) by up to 500.97% (relative to the baseline).

2021

pdf
#HowYouTagTweets: Learning User Hashtagging Preferences via Personalized Topic Attention
Yuji Zhang | Yubo Zhang | Chunpu Xu | Jing Li | Ziyan Jiang | Baolin Peng
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Millions of hashtags are created on social media every day to cross-refer messages concerning similar topics. To help people find the topics they want to discuss, this paper characterizes a user’s hashtagging preferences via predicting how likely they will post with a hashtag. It is hypothesized that one’s interests in a hashtag are related with what they said before (user history) and the existing posts present the hashtag (hashtag contexts). These factors are married in the deep semantic space built with a pre-trained BERT and a neural topic model via multitask learning. In this way, user interests learned from the past can be customized to match future hashtags, which is beyond the capability of existing methods assuming unchanged hashtag semantics. Furthermore, we propose a novel personalized topic attention to capture salient contents to personalize hashtag contexts. Experiments on a large-scale Twitter dataset show that our model significantly outperforms the state-of-the-art recommendation approach without exploiting latent topics.

pdf
Personalized Search-based Query Rewrite System for Conversational AI
Eunah Cho | Ziyan Jiang | Jie Hao | Zheng Chen | Saurabh Gupta | Xing Fan | Chenlei Guo
Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI

Query rewrite (QR) is an emerging component in conversational AI systems, reducing user defect. User defect is caused by various reasons, such as errors in the spoken dialogue system, users’ slips of the tongue or their abridged language. Many of the user defects stem from personalized factors, such as user’s speech pattern, dialect, or preferences. In this work, we propose a personalized search-based QR framework, which focuses on automatic reduction of user defect. We build a personalized index for each user, which encompasses diverse affinity layers to reflect personal preferences for each user in the conversational AI. Our personalized QR system contains retrieval and ranking layers. Supported by user feedback based learning, training our models does not require hand-annotated data. Experiments on personalized test set showed that our personalized QR system is able to correct systematic and user errors by utilizing phonetic and semantic inputs.