Evaluation and Continual Improvement for an Enterprise AI Assistant
Akash Maharaj, Kun Qian, Uttaran Bhattacharya, Sally Fang, Horia Galatanu, Manas Garg, Rachel Hanessian, Nishant Kapoor, Ken Russell, Shivakumar Vaithyanathan, Yunyao Li
Abstract
The development of conversational AI assistants is an iterative process with many components involved. As such, the evaluation and continual improvement of these assistants is a complex and multifaceted problem. This paper introduces the challenges in evaluating and improving a generative AI assistant for enterprise that is under active development and how we address these challenges. We also share preliminary results and discuss lessons learned.
- Anthology ID:
- 2024.dash-1.3
- Volume:
- Proceedings of the Fifth Workshop on Data Science with Human-in-the-Loop (DaSH 2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Eduard Dragut, Yunyao Li, Lucian Popa, Slobodan Vucetic, Shashank Srivastava
- Venues:
- DaSH | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 17–24
- URL:
- https://aclanthology.org/2024.dash-1.3
- DOI:
- 10.18653/v1/2024.dash-1.3
- Cite (ACL):
- Akash Maharaj, Kun Qian, Uttaran Bhattacharya, Sally Fang, Horia Galatanu, Manas Garg, Rachel Hanessian, Nishant Kapoor, Ken Russell, Shivakumar Vaithyanathan, and Yunyao Li. 2024. Evaluation and Continual Improvement for an Enterprise AI Assistant. In Proceedings of the Fifth Workshop on Data Science with Human-in-the-Loop (DaSH 2024), pages 17–24, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Evaluation and Continual Improvement for an Enterprise AI Assistant (Maharaj et al., DaSH-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.dash-1.3.pdf