PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation

Ishan Jindal; Alexandre Rademaker; Khoi-Nguyen Tran; Huaiyu Zhu; Hiroshi Kanayama; Marina Danilevsky; Yunyao Li

doi:10.18653/v1/2023.findings-eacl.134

Abstract

Semantic role labeling (SRL) identifies the predicate-argument structure in a sentence. This task is usually accomplished in four steps: predicate identification, predicate sense disambiguation, argument identification, and argument classification. Errors introduced at one step propagate to later steps. Unfortunately, the existing SRL evaluation scripts do not consider the full effect of this error propagation. They either evaluate arguments independently of predicate sense (CoNLL09) or do not evaluate predicate sense at all (CoNLL05), which yields an inaccurate picture of SRL model performance on the argument classification task. In this paper, we address key practical issues with existing evaluation scripts and propose a stricter SRL evaluation metric, PriMeSRL. We observe that under PriMeSRL, the measured quality of all state-of-the-art (SoTA) SRL models drops significantly, and their relative rankings also change. We also show that PriMeSRL successfully penalizes actual failures in SoTA SRL models.
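To make the error-propagation point concrete, the following is a minimal sketch that scores the same predictions twice: once ignoring predicate sense, and once crediting an argument only when its predicate's sense is also correct. It is a toy illustration, not the PriMeSRL metric and not the CoNLL05/CoNLL09 scripts; the frame layout, the function name `argument_f1`, the `require_sense` flag, and the example sense and role labels are all assumptions made for this example.

```python
from typing import Dict, Tuple

# Hypothetical frame layout: predicate token index -> (sense, {argument index: role}).
Frames = Dict[int, Tuple[str, Dict[int, str]]]


def argument_f1(gold: Frames, pred: Frames, require_sense: bool) -> float:
    """Toy argument F1. If require_sense is True, arguments of a predicate
    with a wrong sense earn no credit (a simplified view of error propagation)."""
    n_gold = sum(len(args) for _, args in gold.values())
    n_pred = sum(len(args) for _, args in pred.values())
    correct = 0
    for pred_idx, (gold_sense, gold_args) in gold.items():
        if pred_idx not in pred:
            continue  # predicate was not identified at all
        pred_sense, pred_args = pred[pred_idx]
        if require_sense and pred_sense != gold_sense:
            continue  # sense error propagates to every argument of this predicate
        correct += sum(1 for arg, role in pred_args.items()
                       if gold_args.get(arg) == role)
    if correct == 0:
        return 0.0
    precision = correct / n_pred
    recall = correct / n_gold
    return 2 * precision * recall / (precision + recall)


# Made-up example: one predicate has a wrong sense, the other has one wrong role.
gold = {1: ("run.01", {0: "A0", 2: "A1"}),
        5: ("buy.01", {4: "A0", 6: "A1"})}
pred = {1: ("run.02", {0: "A0", 2: "A1"}),
        5: ("buy.01", {4: "A0", 6: "A2"})}

lenient = argument_f1(gold, pred, require_sense=False)
strict = argument_f1(gold, pred, require_sense=True)
print(f"sense-independent F1: {lenient:.2f}, sense-dependent F1: {strict:.2f}")
```

On this made-up input the sense-dependent score is lower than the sense-independent one, because the two arguments attached to the mis-disambiguated predicate no longer count as correct.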

Anthology ID: 2023.findings-eacl.134
Volume: Findings of the Association for Computational Linguistics: EACL 2023
Month: May
Year: 2023
Address: Dubrovnik, Croatia
Editors: Andreas Vlachos, Isabelle Augenstein
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1806–1818
URL: https://aclanthology.org/2023.findings-eacl.134
DOI: 10.18653/v1/2023.findings-eacl.134
Cite (ACL): Ishan Jindal, Alexandre Rademaker, Khoi-Nguyen Tran, Huaiyu Zhu, Hiroshi Kanayama, Marina Danilevsky, and Yunyao Li. 2023. PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation. In Findings of the Association for Computational Linguistics: EACL 2023, pages 1806–1818, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal): PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation (Jindal et al., Findings 2023)
PDF: https://preview.aclanthology.org/dois-2013-emnlp/2023.findings-eacl.134.pdf
Software: 2023.findings-eacl.134.software.zip
Video: https://preview.aclanthology.org/dois-2013-emnlp/2023.findings-eacl.134.mp4
