“... like a needle in a haystack”: Annotation and Classification of Comparative Statements

Pritha Majumdar, Franziska Pannach, Arianna Graciotti, Johan Bos


Abstract
We draw a clear distinction between comparisons and similes and provide fine-grained annotation guidelines that facilitate the structural annotation and assessment of the two classes. We make three major contributions: 1) a publicly available annotated data set of 100 comparative statements; 2) theoretically grounded annotation guidelines for human annotators; and 3) results of machine learning experiments to establish how the often subtle distinction between the two phenomena can be automated.
Anthology ID:
2025.latechclfl-1.23
Volume:
Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025)
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Anna Kazantseva, Stan Szpakowicz, Stefania Degaetano-Ortlieb, Yuri Bizzoni, Janis Pagel
Venues:
LaTeCHCLfL | WS
Publisher:
Association for Computational Linguistics
Pages:
261–271
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.latechclfl-1.23/
Cite (ACL):
Pritha Majumdar, Franziska Pannach, Arianna Graciotti, and Johan Bos. 2025. “... like a needle in a haystack”: Annotation and Classification of Comparative Statements. In Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025), pages 261–271, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
“... like a needle in a haystack”: Annotation and Classification of Comparative Statements (Majumdar et al., LaTeCHCLfL 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.latechclfl-1.23.pdf