Marcos Estecha-Garitagoitia


2025

Context or Retrieval? Evaluating RAG Methods for Art and Museum QA System
Samuel Ramos-Varela | Jaime Bellver-Soler | Marcos Estecha-Garitagoitia | Luis Fernando D’Haro
Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology

Recent studies suggest that increasing the context window of language models could outperform retrieval-augmented generation (RAG) methods in certain tasks. However, in domains such as art and museums, where information is inherently multimodal, combining images and detailed textual descriptions, this assumption needs closer examination. To explore this, we compare RAG techniques with direct large-context input approaches for answering questions about artworks. Using a dataset of painting images paired with textual information, we develop a synthetic database of question-answer (QA) pairs for evaluating these methods. The focus is on assessing the efficiency and accuracy of RAG in retrieving and using relevant information compared to passing the entire textual context to a language model. Additionally, we experiment with various strategies for segmenting and retrieving text to optimise the RAG pipeline. The results aim to clarify the trade-offs between these approaches and provide valuable insights for interactive systems designed for art and museum contexts.