Validating Automatic Evaluation of Controllable Counterspeech Generation: Rankings Matter More Than Scores

Yi Zheng; Björn Ross; Walid Magdy


Abstract

Counterspeech generation has emerged as a promising approach to combat online hate speech, with recent work focusing on controlling attributes used in counterspeech, such as strategies or intents. While these attributes are often evaluated automatically using classifiers, a key goal of this evaluation is to compare the performance of different generation models. However, the validity of such evaluation results is questionable when the classifiers themselves have only modest performance. This paper examines the automatic evaluation of counterspeech attributes using a multi-attribute counterspeech dataset containing 2,728 samples. We investigate when automatic evaluation can be trusted for model comparison and address the limitations of current evaluation methodologies. We make concrete recommendations for how to perform classifier validation before model evaluation. Our classifier validation results demonstrate that even limited classifiers can produce trustworthy model rankings. Therefore, we argue that when comparing counterspeech generation models, a classifier’s ability to rank generation models is a more direct measure of its practical utility than traditional classification metrics, e.g., accuracy and F1.
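The ranking argument in the abstract can be illustrated with a small sketch. This is not the paper's validation procedure: the model names and scores below are invented for illustration, and Kendall's tau is simply one common rank-agreement statistic assumed here. The sketch shows how classifier-estimated scores can be compressed relative to human-judged scores and still order the models identically.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall tau-a between two equal-length score lists.

    Tied pairs count as neither concordant nor discordant.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def ranking(scores):
    """Model names sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical numbers, for illustration only: the share of each model's
# outputs judged to carry the requested attribute.
human = {"model_a": 0.80, "model_b": 0.65, "model_c": 0.50, "model_d": 0.30}
# A noisy classifier compresses the scores but keeps the same order.
classifier = {"model_a": 0.62, "model_b": 0.55, "model_c": 0.47, "model_d": 0.41}

names = list(human)
tau = kendall_tau([human[n] for n in names], [classifier[n] for n in names])
print(ranking(human), ranking(classifier), tau)
```

In this toy setting the absolute scores disagree for every model, yet the two rankings coincide, which is the property that matters when the goal is to compare generation models.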

Anthology ID: 2026.eacl-long.193
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 4131–4146
URL: https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.193/
Cite (ACL): Yi Zheng, Björn Ross, and Walid Magdy. 2026. Validating Automatic Evaluation of Controllable Counterspeech Generation: Rankings Matter More Than Scores. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4131–4146, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Validating Automatic Evaluation of Controllable Counterspeech Generation: Rankings Matter More Than Scores (Zheng et al., EACL 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.193.pdf
