The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization

Simeng Sun; Ani Nenkova

doi:10.18653/v1/D19-1116

Abstract

ROUGE is widely used to automatically evaluate summarization systems. However, ROUGE measures semantic overlap between a system summary and a human reference at the word-string level, much at odds with the contemporary treatment of semantic meaning. Here we present a suite of experiments on using distributed representations for evaluating summarizers, in both reference-based and reference-free settings. Our experimental results show that the max value over each dimension of the summary ELMo word embeddings is a good representation that results in high correlation with human ratings. Averaging the cosine similarity of all encoders we tested yields high correlation with manual scores in the reference-free setting. The distributed representations outperform ROUGE on recent corpora for abstractive news summarization but perform worse on test data used in past evaluations.
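The max-over-each-dimension representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The toy 3-dimensional vectors stand in for real ELMo word embeddings, the function names `max_pool`, `cosine`, and `embedding_score` are hypothetical, and scoring the pooled summary vector against a pooled comparison text (a human reference, or the source document in the reference-free setting) with cosine similarity is an assumption based on the abstract.

```python
import math
from typing import List, Sequence

Vector = List[float]


def max_pool(word_vectors: Sequence[Sequence[float]]) -> Vector:
    """Collapse per-word vectors into one vector by taking the max of each dimension."""
    if not word_vectors:
        raise ValueError("need at least one word vector")
    # zip(*...) groups the values of each dimension across all words.
    return [max(column) for column in zip(*word_vectors)]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def embedding_score(summary_vectors: Sequence[Sequence[float]],
                    other_vectors: Sequence[Sequence[float]]) -> float:
    """Assumed scoring rule: cosine similarity of the two max-pooled vectors."""
    return cosine(max_pool(summary_vectors), max_pool(other_vectors))


# Toy word vectors (placeholders, not real ELMo output).
system_summary = [[0.1, 0.9, -0.2], [0.4, 0.2, 0.3]]
reference = [[0.4, 0.1, 0.3], [0.0, 0.9, 0.1]]

print(embedding_score(system_summary, reference))
```

In practice the word vectors would come from a contextual encoder, and the resulting scores would be correlated with human ratings across many summaries rather than read off for a single pair.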

Anthology ID: D19-1116
Volume: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues: EMNLP | IJCNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 1216–1221
URL: https://preview.aclanthology.org/ingest_wac_2008/D19-1116/
DOI: 10.18653/v1/D19-1116
Cite (ACL): Simeng Sun and Ani Nenkova. 2019. The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 1216–1221, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): The Feasibility of Embedding Based Automatic Evaluation for Single Document Summarization (Sun & Nenkova, EMNLP-IJCNLP 2019)
PDF: https://preview.aclanthology.org/ingest_wac_2008/D19-1116.pdf
Attachment: D19-1116.Attachment.pdf