Abstract
Retrieval-augmented generation (RAG) combines knowledge from domain-specific sources into large language models to ground answer generation. Current RAG systems lack customizable visibility on the context documents and the model’s attentiveness towards such documents. We propose RAGViz, a RAG diagnosis tool that visualizes the attentiveness of the generated tokens in retrieved documents. With a built-in user interface, retrieval index, and Large Language Model (LLM) backbone, RAGViz provides two main functionalities: (1) token and document-level attention visualization, and (2) generation comparison upon context document addition and removal. As an open-source toolkit, RAGViz can be easily hosted with a custom embedding model and HuggingFace-supported LLM backbone. Using a hybrid ANN (Approximate Nearest Neighbor) index, memory-efficient LLM inference tool, and custom context snippet method, RAGViz operates efficiently with a median query time of about 5 seconds on a moderate GPU node. Our code is available at https://github.com/cxcscmu/RAGViz. A demo video of RAGViz can be found at https://youtu.be/cTAbuTu6ur4.- Anthology ID:
- 2024.emnlp-demo.33
- Volume:
- Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Delia Irazu Hernandez Farias, Tom Hope, Manling Li
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 320–327
- Language:
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.emnlp-demo.33/
- DOI:
- 10.18653/v1/2024.emnlp-demo.33
- Cite (ACL):
- Tevin Wang, Jingyuan He, and Chenyan Xiong. 2024. RAGViz: Diagnose and Visualize Retrieval-Augmented Generation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 320–327, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- RAGViz: Diagnose and Visualize Retrieval-Augmented Generation (Wang et al., EMNLP 2024)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.emnlp-demo.33.pdf