Summary Cycles: Exploring the Impact of Prompt Engineering on Large Language Models’ Interaction with Interaction Log Information
Jeremy Block, Yu-Peng Chen, Abhilash Budharapu, Lisa Anthony, Bonnie Dorr
Abstract
With the aim of improving work efficiency, we examine how Large Language Models (LLMs) can better support the handoff of information by summarizing user interactions in collaborative intelligence analysis communication. We experiment with interaction logs, i.e., records of user interactions with a system. Inspired by chain-of-thought prompting, we describe a technique for avoiding API token limits through recursive summarization requests. We then apply ChatGPT over multiple iterations to extract named entities, topics, and summaries, which are combined with interaction sequence sentences to generate summaries of critical events and the results of analysis sessions. We quantitatively evaluate the generated summaries against human-generated ones using common accuracy metrics (e.g., ROUGE-L, BLEU, BLEURT, and TER). We also report qualitative trends and the factuality of the output. We find that manipulating the audience feature or providing single-shot examples has minimal influence on the model’s accuracy. While our methodology successfully summarizes interaction logs, the lack of significant results raises questions about prompt engineering and summarization effectiveness generally. We call on explainable artificial intelligence research to better understand how terms and their placement may change LLM outputs, striving for more consistent prompt engineering guidelines.
- Anthology ID:
- 2023.eval4nlp-1.7
- Volume:
- Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems
- Month:
- November
- Year:
- 2023
- Address:
- Bali, Indonesia
- Editors:
- Daniel Deutsch, Rotem Dror, Steffen Eger, Yang Gao, Christoph Leiter, Juri Opitz, Andreas Rücklé
- Venues:
- Eval4NLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 85–99
- URL:
- https://aclanthology.org/2023.eval4nlp-1.7
- DOI:
- 10.18653/v1/2023.eval4nlp-1.7
- Cite (ACL):
- Jeremy Block, Yu-Peng Chen, Abhilash Budharapu, Lisa Anthony, and Bonnie Dorr. 2023. Summary Cycles: Exploring the Impact of Prompt Engineering on Large Language Models’ Interaction with Interaction Log Information. In Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems, pages 85–99, Bali, Indonesia. Association for Computational Linguistics.
- Cite (Informal):
- Summary Cycles: Exploring the Impact of Prompt Engineering on Large Language Models’ Interaction with Interaction Log Information (Block et al., Eval4NLP-WS 2023)
- PDF:
- https://aclanthology.org/2023.eval4nlp-1.7.pdf
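The recursive summarization strategy described in the abstract (summarizing interaction-log chunks, then summarizing the concatenated summaries until the result fits within a budget) can be sketched roughly as follows. This is a minimal illustration, not the paper's actual pipeline: `summarize` is a hypothetical stand-in for a ChatGPT API request, truncation is used only so the sketch runs without API access, and a character `limit` stands in for a token limit.

```python
def summarize(text: str, max_len: int) -> str:
    """Hypothetical stand-in for an LLM summarization call.

    A real implementation would send `text` to a model such as ChatGPT;
    truncation is used here only to keep the sketch self-contained.
    """
    return text[:max_len]

def recursive_summarize(chunks: list[str], limit: int = 200) -> str:
    """Summarize each chunk, then recursively summarize the concatenated
    summaries until the combined result fits within `limit` characters."""
    # Summarize each chunk to at most half the budget so the text shrinks.
    summaries = [summarize(chunk, limit // 2) for chunk in chunks]
    combined = " ".join(summaries)
    if len(combined) <= limit:
        return combined
    # Still over budget: re-chunk the combined summaries and recurse.
    new_chunks = [combined[i:i + limit] for i in range(0, len(combined), limit)]
    return recursive_summarize(new_chunks, limit)
```

Each round of summarization roughly halves the text, so the recursion terminates with a single summary that fits the budget, which is the same idea the paper uses to keep long interaction logs within the API's token limit.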