Evaluation and Continual Improvement for an Enterprise AI Assistant

Akash Maharaj, Kun Qian, Uttaran Bhattacharya, Sally Fang, Horia Galatanu, Manas Garg, Rachel Hanessian, Nishant Kapoor, Ken Russell, Shivakumar Vaithyanathan, Yunyao Li


Abstract
Developing a conversational AI assistant is an iterative process that involves many components, so evaluating and continually improving such an assistant is a complex, multifaceted problem. This paper describes the challenges of evaluating and improving a generative AI assistant for enterprise use that is under active development, and how we address them. We also share preliminary results and discuss lessons learned.
Anthology ID: 2024.dash-1.3
Volume: Proceedings of the Fifth Workshop on Data Science with Human-in-the-Loop (DaSH 2024)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Eduard Dragut, Yunyao Li, Lucian Popa, Slobodan Vucetic, Shashank Srivastava
Venues: DaSH | WS
Publisher: Association for Computational Linguistics
Pages: 17–24
URL: https://aclanthology.org/2024.dash-1.3
DOI: 10.18653/v1/2024.dash-1.3
Cite (ACL): Akash Maharaj, Kun Qian, Uttaran Bhattacharya, Sally Fang, Horia Galatanu, Manas Garg, Rachel Hanessian, Nishant Kapoor, Ken Russell, Shivakumar Vaithyanathan, and Yunyao Li. 2024. Evaluation and Continual Improvement for an Enterprise AI Assistant. In Proceedings of the Fifth Workshop on Data Science with Human-in-the-Loop (DaSH 2024), pages 17–24, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): Evaluation and Continual Improvement for an Enterprise AI Assistant (Maharaj et al., DaSH-WS 2024)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2024.dash-1.3.pdf