Abstract
Retrieval-augmented question-answering systems combine retrieval techniques with large language models to provide answers that are more accurate and informative. Many existing toolkits allow users to quickly build such systems using off-the-shelf models, but they fall short of helping researchers and developers customize the *model training, testing, and deployment process*. We propose LocalRQA, an open-source toolkit that features a wide selection of model training algorithms, evaluation methods, and deployment tools curated from the latest research. As a showcase, we build QA systems using online documentation obtained from Databricks and Faire's websites. We find that 7B-models trained and deployed using LocalRQA reach performance similar to that obtained with OpenAI's text-ada-002 and GPT-4-turbo.
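To make the pipeline the abstract describes concrete, below is a minimal sketch of a generic retrieval-augmented QA loop: embed documentation passages, retrieve the most similar ones for a question, and pass them to a generator model. This is an illustrative sketch only, not the LocalRQA API; it assumes the `sentence-transformers` package is available, and `generate_answer` is a hypothetical placeholder for a call to a locally served LLM.

```python
# Illustrative retrieval-augmented QA loop (not the LocalRQA API).
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy "documentation" corpus, standing in for pages scraped from a website.
documents = [
    "Databricks clusters can be configured with autoscaling enabled.",
    "Faire sellers can update their shipping settings from the dashboard.",
    "Retrieval-augmented QA grounds model answers in retrieved passages.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = encoder.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k passages most similar to the question (cosine similarity)."""
    q = encoder.encode([question], normalize_embeddings=True)[0]
    scores = doc_embeddings @ q
    top = np.argsort(-scores)[:k]
    return [documents[i] for i in top]

def generate_answer(question: str, passages: list[str]) -> str:
    """Hypothetical placeholder for the generator (e.g., a local 7B model)."""
    prompt = (
        "Answer the question using the context below.\n\n"
        + "\n".join(passages)
        + f"\n\nQ: {question}\nA:"
    )
    return prompt  # a real system would send `prompt` to the generator model

question = "How do I change shipping settings on Faire?"
print(generate_answer(question, retrieve(question)))
```

A toolkit such as LocalRQA wraps steps like these with training data generation, retriever/generator fine-tuning, evaluation, and serving; the sketch only shows the inference-time retrieve-then-generate flow.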
- Anthology ID: 2024.acl-demos.14
- Volume: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
- Month: August
- Year: 2024
- Address: Bangkok, Thailand
- Editors: Yixin Cao, Yang Feng, Deyi Xiong
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 136–151
- URL: https://aclanthology.org/2024.acl-demos.14
- DOI: 10.18653/v1/2024.acl-demos.14
- Cite (ACL): Xiao Yu, Yunan Lu, and Zhou Yu. 2024. LocalRQA: From Generating Data to Locally Training, Testing, and Deploying Retrieval-Augmented QA Systems. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 136–151, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal): LocalRQA: From Generating Data to Locally Training, Testing, and Deploying Retrieval-Augmented QA Systems (Yu et al., ACL 2024)
- PDF: https://preview.aclanthology.org/autopr/2024.acl-demos.14.pdf