Bernardo Ramos


2024

Optimizing LLM Based Retrieval Augmented Generation Pipelines in the Financial Domain
Yiyun Zhao | Prateek Singh | Hanoz Bhathena | Bernardo Ramos | Aviral Joshi | Swaroop Gadiyaram | Saket Sharma
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 6: Industry Track)

Retrieval Augmented Generation (RAG) is a prominent approach in real-world applications for grounding large language model (LLM) generations in up-to-date and domain-specific knowledge. However, there is a lack of systematic investigations of the impact of each component (retrieval quality, prompts, generation models) on the generation quality of a RAG pipeline in real-world scenarios. In this study, we benchmark 6 LLMs in 15 retrieval scenarios, exploring 9 prompts over 2 real-world financial domain datasets. We thoroughly discuss the impact of each component in the RAG pipeline on answer generation quality and formulate specific recommendations for the design of RAG systems.