CEval: A Benchmark for Evaluating Counterfactual Text Generation

Van Bach Nguyen; Christin Seifert; Jörg Schlötterer

Abstract

Counterfactual text generation aims to minimally change a text such that it is classified differently. Assessing progress in method development for counterfactual text generation is hindered by the non-uniform use of datasets and metrics in related work. We propose CEval, a benchmark for comparing counterfactual text generation methods. CEval unifies counterfactual and text quality metrics and includes common counterfactual datasets with human annotations, standard baselines (MICE, GDBA, CREST), and the open-source language model LLAMA-2. Our experiments found no perfect method for generating counterfactual text: methods that excel at counterfactual metrics often produce lower-quality text, while LLMs with simple prompts generate high-quality text but struggle with counterfactual criteria. By making CEval available as an open-source Python library, we encourage the community to contribute additional methods and maintain consistent evaluation in future work.

Anthology ID: 2024.inlg-main.6
Volume: Proceedings of the 17th International Natural Language Generation Conference
Month: September
Year: 2024
Address: Tokyo, Japan
Editors: Saad Mahamood, Nguyen Le Minh, Daphne Ippolito
Venue: INLG
SIG: SIGGEN
Publisher: Association for Computational Linguistics
Pages: 55–69
URL: https://preview.aclanthology.org/add_missing_videos/2024.inlg-main.6/
Cite (ACL): Van Bach Nguyen, Christin Seifert, and Jörg Schlötterer. 2024. CEval: A Benchmark for Evaluating Counterfactual Text Generation. In Proceedings of the 17th International Natural Language Generation Conference, pages 55–69, Tokyo, Japan. Association for Computational Linguistics.
Cite (Informal): CEval: A Benchmark for Evaluating Counterfactual Text Generation (Nguyen et al., INLG 2024)
PDF: https://preview.aclanthology.org/add_missing_videos/2024.inlg-main.6.pdf
Supplementary attachment: 2024.inlg-main.6.Supplementary_Attachment.pdf
