Comparative evaluation of boundary-relaxed annotation for Entity Linking performance

Gabriel Herman Bernardim Andrade, Shuntaro Yada, Eiji Aramaki


Abstract
Entity Linking performance relies heavily on the availability of a large quantity of high-quality annotated training data. Yet manual annotation of named entities, especially their boundaries, is ambiguous, error-prone, and gives rise to many inconsistencies between annotators. While imprecise boundary annotation can degrade a model’s performance, there are applications where accurate extraction of entities’ surface forms is not necessary. In those cases, a lenient annotation guideline could relieve the annotators’ workload and speed up the process. This paper presents a case study designed to verify the feasibility of such an annotation process and to evaluate the impact of boundary-relaxed annotation on an Entity Linking pipeline. We first generate a set of noisy versions of the widely used AIDA CoNLL-YAGO dataset by expanding the boundaries of subsets of annotated entity mentions. We then train three Entity Linking models on this data and evaluate the relative impact of imprecise annotation on entity recognition and disambiguation performance. We demonstrate that the magnitude of the effects caused by noise in the Named Entity Recognition phase depends on both model complexity and noise ratio, while Entity Disambiguation components are susceptible to entity boundary imprecision due to strong vocabulary dependency.
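The noise-generation step described in the abstract (expanding the boundaries of a subset of annotated mentions) might be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the function name `relax_boundaries`, the token-index span representation, and the `noise_ratio`, `max_expand`, and `seed` parameters are all hypothetical, and the abstract does not specify how mentions are selected or how far boundaries are moved.

```python
import random
from typing import List, Tuple

# A mention is a (start, end) pair of token indices, end exclusive.
Span = Tuple[int, int]


def relax_boundaries(
    spans: List[Span],
    n_tokens: int,
    noise_ratio: float,
    max_expand: int = 2,   # assumption: expand by at most this many tokens per side
    seed: int = 0,
) -> List[Span]:
    """Expand the boundaries of a random subset of mentions.

    Assumes `spans` is sorted by position and non-overlapping.
    Expanded spans always contain the original span and never
    overlap a neighbouring mention or leave the sentence.
    """
    rng = random.Random(seed)
    n_noisy = round(len(spans) * noise_ratio)
    chosen = set(rng.sample(range(len(spans)), n_noisy))

    relaxed: List[Span] = []
    for i, (start, end) in enumerate(spans):
        if i not in chosen:
            relaxed.append((start, end))
            continue
        # Left limit: end of the previous (possibly already expanded) mention.
        left_limit = relaxed[-1][1] if relaxed else 0
        # Right limit: original start of the next mention, or sentence end.
        right_limit = spans[i + 1][0] if i + 1 < len(spans) else n_tokens
        new_start = max(left_limit, start - rng.randint(0, max_expand))
        new_end = min(right_limit, end + rng.randint(0, max_expand))
        relaxed.append((new_start, new_end))
    return relaxed
```

Calling `relax_boundaries(spans, n_tokens, noise_ratio=0.5)` would perturb roughly half of the mentions in a sentence, so sweeping `noise_ratio` yields a family of progressively noisier copies of a dataset. Because each expansion is drawn from 0 to `max_expand` per side, a selected mention can occasionally remain unchanged.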
Anthology ID:
2023.acl-long.458
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
8238–8253
URL:
https://aclanthology.org/2023.acl-long.458
DOI:
10.18653/v1/2023.acl-long.458
Cite (ACL):
Gabriel Herman Bernardim Andrade, Shuntaro Yada, and Eiji Aramaki. 2023. Comparative evaluation of boundary-relaxed annotation for Entity Linking performance. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8238–8253, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Comparative evaluation of boundary-relaxed annotation for Entity Linking performance (Herman Bernardim Andrade et al., ACL 2023)
PDF:
https://preview.aclanthology.org/landing_page/2023.acl-long.458.pdf
Video:
https://preview.aclanthology.org/landing_page/2023.acl-long.458.mp4