Srinivas Ramesh Kamath


2026

We describe and evaluate two different architectures for creating book highlights from unstructured data. Given the prevalence of large language models, we examine whether a pipeline-based approach with intermediate steps for text generation is still necessary and whether it continues to offer any benefits over an end-to-end approach. Our comparative evaluations using LLM-as-a-judge across multiple models with different parameter sizes and generation scenarios show that highlights generated by the end-to-end approach are preferred. However, there is a slight but consistent increase in faithfulness for the pipeline-generated highlights when generating at a thematic level. Additionally, our analysis across multiple models shows that while larger models are more faithful, the degree of faithfulness increases when they are used with a pipeline architecture. The findings from our work indicate that whilst there is comparability between the two approaches, the greater faithfulness, controllability, and observability of pipeline-based approaches offer tangible benefits in applied settings.

2024

We describe our implementation and evaluation of the Hotel Highlights system which has been deployed live by trivago. This system leverages a large language model (LLM) to generate a set of highlights from accommodation descriptions and reviews, enabling travellers to quickly understand its unique aspects. In this paper, we discuss our motivation for building this system and the human evaluation we conducted, comparing the generated highlights against the source input to assess the degree of hallucinations and/or contradictions present. Finally, we outline the lessons learned and the improvements needed.