Multi-Dimensional Evaluation of Text Summarization with In-Context Learning

Sameer Jain; Vaishakh Keshava; Swarnashree Mysore Sathyendra; Patrick Fernandes; Pengfei Liu; Graham Neubig; Chunting Zhou

doi:10.18653/v1/2023.findings-acl.537

Abstract

Evaluation of natural language generation (NLG) is complex and multi-dimensional. Generated text can be evaluated for fluency, coherence, factuality, or any other dimensions of interest. Most frameworks that perform such multi-dimensional evaluation require training on large manually or synthetically generated datasets. In this paper, we study the efficacy of large language models as multi-dimensional evaluators using in-context learning, obviating the need for large training datasets. Our experiments show that in-context learning-based evaluators are competitive with learned evaluation frameworks for the task of text summarization, establishing state-of-the-art on dimensions such as relevance and factual consistency. We then analyze the effects of factors such as the selection and number of in-context examples on performance. Finally, we study the efficacy of in-context learning-based evaluators in evaluating zero-shot summaries written by large language models such as GPT-3.
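As a rough illustration of what an in-context learning-based evaluator for one dimension might look like, the sketch below assembles a few-shot prompt from scored examples and leaves the score of the test summary for a language model to complete. The prompt layout, the `ScoredExample` class, and the `build_prompt` function are hypothetical and are not taken from the paper; the call to an actual language model is deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredExample:
    """One in-context example: a source text, a summary, and a human score.

    Hypothetical container; the field names are assumptions for illustration.
    """
    source: str
    summary: str
    score: float


def build_prompt(dimension: str, examples: List[ScoredExample],
                 source: str, summary: str) -> str:
    """Build a few-shot prompt for scoring `summary` on `dimension`.

    Each in-context example shows a text, its summary, and the score.
    The final block repeats the layout for the test pair but leaves the
    score blank so that a language model can fill it in.
    """
    label = dimension.capitalize()
    blocks = [
        f"Text: {ex.source}\nSummary: {ex.summary}\n{label}: {ex.score}"
        for ex in examples
    ]
    # Test instance: same layout, score left for the model to generate.
    blocks.append(f"Text: {source}\nSummary: {summary}\n{label}:")
    return "\n\n".join(blocks)


# Usage with made-up examples (not data from the paper).
demo_examples = [
    ScoredExample("The council met on Monday to vote on the budget.",
                  "The council voted on the budget.", 0.9),
    ScoredExample("Heavy rain flooded several streets downtown.",
                  "Streets the rain flooded downtown several.", 0.2),
]
prompt = build_prompt("coherence", demo_examples,
                      "A new library opened in the city centre.",
                      "The city centre has a new library.")
print(prompt)
```

Changing the `dimension` argument and the scored examples gives a prompt for a different dimension, such as relevance or factual consistency, without any training data beyond the handful of in-context examples.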

Anthology ID: 2023.findings-acl.537
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 8487–8495
URL: https://aclanthology.org/2023.findings-acl.537
DOI: 10.18653/v1/2023.findings-acl.537
Cite (ACL): Sameer Jain, Vaishakh Keshava, Swarnashree Mysore Sathyendra, Patrick Fernandes, Pengfei Liu, Graham Neubig, and Chunting Zhou. 2023. Multi-Dimensional Evaluation of Text Summarization with In-Context Learning. In Findings of the Association for Computational Linguistics: ACL 2023, pages 8487–8495, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Multi-Dimensional Evaluation of Text Summarization with In-Context Learning (Jain et al., Findings 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-3/2023.findings-acl.537.pdf
