@inproceedings{shallouf-etal-2025-compuge,
  title     = {{CompUGE-Bench}: Comparative Understanding and Generation Evaluation Benchmark for Comparative Question Answering},
  author    = {Shallouf, Ahmad and
               Nikishina, Irina and
               Biemann, Chris},
  editor    = {Rambow, Owen and
               Wanner, Leo and
               Apidianaki, Marianna and
               Al-Khalifa, Hend and
               Di Eugenio, Barbara and
               Schockaert, Steven and
               Mather, Brodie and
               Dras, Mark},
  booktitle = {Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations},
  month     = jan,
  year      = {2025},
  address   = {Abu Dhabi, UAE},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.coling-demos.19/},
  pages     = {189--198},
  abstract  = {This paper presents CompUGE, a comprehensive benchmark designed to evaluate Comparative Question Answering (CompQA) systems. The benchmark is structured around four core tasks: Comparative Question Identification, Object and Aspect Identification, Stance Classification, and Answer Generation. It unifies multiple datasets and provides a robust evaluation platform to compare various models across these sub-tasks. We also create additional all-encompassing CompUGE datasets by filtering and merging the existing ones. The benchmark for comparative question answering sub-tasks is designed as a web application available on HuggingFace Spaces: https://huggingface.co/spaces/uhhlt/CompUGE-Bench}
}
@comment{Leftover ACL Anthology page text, kept for reference:
Markdown (Informal)
[CompUGE-Bench: Comparative Understanding and Generation Evaluation Benchmark for Comparative Question Answering](https://aclanthology.org/2025.coling-demos.19/) (Shallouf et al., COLING 2025)
ACL
}