Ryan Shea

2023

Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning
Ryan Shea | Zhou Yu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Maintaining a consistent persona is a key quality for any open-domain dialogue system. Current state-of-the-art systems do this by training agents with supervised learning or online reinforcement learning (RL). However, systems trained with supervised learning often lack consistency, as they are never punished for uttering contradictions. Additional training with RL can alleviate some of these issues, but the training process is expensive. Instead, we propose an offline RL framework to improve the persona consistency of dialogue systems. Our framework allows us to combine the advantages of previous methods: we can inexpensively train our model on existing data, as in supervised learning, while punishing and rewarding specific utterances, as in RL. We also introduce a simple importance sampling method to reduce the variance of importance weights in offline RL training, which we call Variance-Reducing MLE-Initialized (VaRMI) importance sampling. Our automatic and human evaluations show that our framework improves both the persona consistency and dialogue quality of a state-of-the-art social chatbot.

The increasing use of AI chatbots as conversation partners for second-language learners highlights the importance of providing effective feedback. To ensure a successful learning experience, it is essential for researchers and practitioners to understand the optimal timing, methods of delivery, and types of feedback that are most beneficial to learners. Synchronous grammar corrective feedback (CF) has been shown to be more effective than asynchronous methods in online writing tasks. Additionally, self-correction by language learners has proven more beneficial than teacher-provided correction, particularly for spoken language skills and non-novice learners. However, existing language-learning AI chatbots often lack synchronous CF and self-correction capabilities. To address this, we propose a synchronous conversational corrective feedback (CCF) method, which allows self-correction and provides metalinguistic explanations (ME). Our study suggests that in chatbot-driven language-learning tools, corrective feedback is more effectively delivered through means other than the social chatbot itself, such as a GUI. Furthermore, we found that guided self-correction offers a superior learning experience compared to providing explicit corrections, particularly for learners with high learning motivation or lower linguistic ability.

2022

Protecting large language models from privacy leakage is becoming increasingly crucial with their wide adoption in real-world products. Yet applying *differential privacy* (DP), a canonical notion with provable privacy guarantees for machine learning models, to those models remains challenging due to the trade-off between model utility and privacy loss. Utilizing the fact that sensitive information in language data tends to be sparse, Shi et al. (2021) formalized a DP notion extension called *Selective Differential Privacy* (SDP) to protect only the sensitive tokens defined by a policy function. However, their algorithm only works for RNN-based models. In this paper, we develop a novel framework, *Just Fine-tune Twice* (JFT), that achieves SDP for state-of-the-art large transformer-based models. Our method is easy to implement: it first fine-tunes the model with *redacted* in-domain data, and then fine-tunes it again with the *original* in-domain data using a private training mechanism. Furthermore, we study the scenario of imperfect implementation of policy functions that misses sensitive tokens and develop systematic methods to handle it. Experiments show that our method achieves strong utility compared to previous baselines. We also analyze the SDP privacy guarantee empirically with the canary insertion attack.
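The first JFT phase described above fine-tunes on *redacted* data, where a policy function decides which tokens are sensitive. As a rough illustration only, the redaction step might look like the sketch below. The `digit_policy` function, the `<MASK>` placeholder, and the `redact` helper are hypothetical names chosen for this example, not the authors' implementation, and the private training mechanism used in the second phase is not shown.

```python
import re
from typing import Callable, List

# Hypothetical placeholder token; the paper's actual mask token is not given here.
MASK = "<MASK>"


def digit_policy(token: str) -> bool:
    """Toy policy function (an assumption): treat any token containing a digit as sensitive."""
    return bool(re.search(r"\d", token))


def redact(tokens: List[str], policy: Callable[[str], bool], mask: str = MASK) -> List[str]:
    """Replace every token the policy flags as sensitive with the mask token."""
    return [mask if policy(tok) else tok for tok in tokens]


# Phase 1 would fine-tune on sequences like `redacted`;
# phase 2 would fine-tune on the original `tokens` with a private training mechanism.
tokens = "call me at 555-0199 after 5pm".split()
redacted = redact(tokens, digit_policy)
```

A real policy function would be domain-specific (for example, a named-entity tagger), and, as the abstract notes, it may miss some sensitive tokens.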

Co-authors

Kai-Hui Liang (1)

Sam Davidson (1)

Xun Yuan (1)

Shehan Panditharatne (1)

Venues

EMNLP (2)
BEA (1)