Summary Cycles: Exploring the Impact of Prompt Engineering on Large Language Models’ Interaction with Interaction Log Information
Jeremy Block, Yu-Peng Chen, Abhilash Budharapu, Lisa Anthony, Bonnie Dorr
Abstract
With the aim of improving work efficiency, we examine how Large Language Models (LLMs) can better support the handoff of information by summarizing user interactions in collaborative intelligence analysis communication. We experiment with interaction logs, i.e., records of user interactions with a system. Inspired by chain-of-thought prompting, we describe a technique for avoiding API token limits through recursive summarization requests. We then apply ChatGPT over multiple iterations to extract named entities, topics, and summaries, which are combined with interaction sequence sentences to generate summaries of critical events and the results of analysis sessions. We quantitatively evaluate the generated summaries against human-generated ones using common accuracy metrics (e.g., ROUGE-L, BLEU, BLEURT, and TER). We also report qualitative trends and the factuality of the output. We find that manipulating the audience feature or providing single-shot examples has minimal influence on the model’s accuracy. While our methodology successfully summarizes interaction logs, the lack of significant results raises questions about prompt engineering and summarization effectiveness generally. We call on explainable artificial intelligence research to better understand how terms and their placement may change LLM outputs, striving for more consistent prompt engineering guidelines.
- Anthology ID:
- 2023.eval4nlp-1.7
- Volume:
- Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems
- Month:
- November
- Year:
- 2023
- Address:
- Bali, Indonesia
- Editors:
- Daniel Deutsch, Rotem Dror, Steffen Eger, Yang Gao, Christoph Leiter, Juri Opitz, Andreas Rücklé
- Venues:
- Eval4NLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 85–99
- URL:
- https://aclanthology.org/2023.eval4nlp-1.7
- DOI:
- 10.18653/v1/2023.eval4nlp-1.7
- Cite (ACL):
- Jeremy Block, Yu-Peng Chen, Abhilash Budharapu, Lisa Anthony, and Bonnie Dorr. 2023. Summary Cycles: Exploring the Impact of Prompt Engineering on Large Language Models’ Interaction with Interaction Log Information. In Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems, pages 85–99, Bali, Indonesia. Association for Computational Linguistics.
- Cite (Informal):
- Summary Cycles: Exploring the Impact of Prompt Engineering on Large Language Models’ Interaction with Interaction Log Information (Block et al., Eval4NLP-WS 2023)
- PDF:
- https://aclanthology.org/2023.eval4nlp-1.7.pdf
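The recursive summarization strategy described in the abstract (summarizing interaction-log chunks, then summarizing the concatenated summaries until the result fits within a budget) can be sketched roughly as follows. This is a minimal illustration, not the paper's actual pipeline: `summarize` is a hypothetical stand-in for a ChatGPT API request, truncation is used only so the sketch runs without API access, and a character `limit` stands in for a token limit.

```python
def summarize(text: str, max_len: int) -> str:
    """Hypothetical stand-in for an LLM summarization call.

    A real implementation would send `text` to a model such as ChatGPT;
    truncation is used here only to keep the sketch self-contained.
    """
    return text[:max_len]

def recursive_summarize(chunks: list[str], limit: int = 200) -> str:
    """Summarize each chunk, then recursively summarize the concatenated
    summaries until the combined result fits within `limit` characters."""
    # Summarize each chunk to at most half the budget so the text shrinks.
    summaries = [summarize(chunk, limit // 2) for chunk in chunks]
    combined = " ".join(summaries)
    if len(combined) <= limit:
        return combined
    # Still over budget: re-chunk the combined summaries and recurse.
    new_chunks = [combined[i:i + limit] for i in range(0, len(combined), limit)]
    return recursive_summarize(new_chunks, limit)
```

Each round of summarization roughly halves the text, so the recursion terminates with a single summary that fits the budget, which is the same idea the paper uses to keep long interaction logs within the API's token limit.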