ÚFAL CorPipe at CRAC 2023: Larger Context Improves Multilingual Coreference Resolution

Milan Straka


Abstract
We present CorPipe, the winning entry to the CRAC 2023 Shared Task on Multilingual Coreference Resolution. Our system is an improved version of our earlier multilingual coreference pipeline, and it surpasses other participants by a large margin of 4.5 percentage points. CorPipe first performs mention detection, followed by coreference linking via an antecedent-maximization approach on the retrieved spans. Both tasks are trained jointly on all available corpora using a shared pretrained language model. Our main improvements comprise inputs larger than 512 subwords and changing the mention decoding to support ensembling. The source code is available at https://github.com/ufal/crac2023-corpipe.
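The linking step described in the abstract can be illustrated with a minimal sketch. This is not the CorPipe implementation. It assumes that mentions have already been detected and ordered by position, that a score is available for every (mention, candidate antecedent) pair, and that a mention choosing itself means "no antecedent", so it starts a new entity. The function name `link_antecedents` and the layout of the score matrix are hypothetical.

```python
from typing import List, Sequence


def link_antecedents(scores: Sequence[Sequence[float]]) -> List[List[int]]:
    """Group mentions into clusters by picking the best-scoring antecedent.

    scores[i][j] (for j <= i) is an assumed score of mention j being the
    antecedent of mention i; j == i stands for "no antecedent".
    """
    cluster_of: List[int] = []      # cluster index assigned to each mention
    clusters: List[List[int]] = []  # mention indices grouped by entity

    for i, row in enumerate(scores):
        # Only earlier mentions and the mention itself are valid candidates.
        best = max(range(i + 1), key=lambda j: row[j])
        if best == i:
            # The mention selected itself, so it opens a new cluster.
            cluster_of.append(len(clusters))
            clusters.append([i])
        else:
            # Join the cluster of the selected antecedent.
            cluster_of.append(cluster_of[best])
            clusters[cluster_of[best]].append(i)
    return clusters


# Toy scores for four mentions (made-up numbers for illustration only).
toy_scores = [
    [0.0],
    [2.0, 0.5],
    [0.1, 0.2, 1.0],
    [0.3, 0.1, 4.0, 0.0],
]
print(link_antecedents(toy_scores))
```

In this toy input, mention 1 links to mention 0, mention 2 starts a new entity, and mention 3 links to mention 2. In a real system the scores would come from the shared pretrained language model rather than being fixed by hand.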
Anthology ID:
2023.crac-sharedtask.4
Volume:
Proceedings of the CRAC 2023 Shared Task on Multilingual Coreference Resolution
Month:
December
Year:
2023
Address:
Singapore
Editors:
Zdeněk Žabokrtský, Maciej Ogrodniczuk
Venues:
CRAC | WS
Publisher:
Association for Computational Linguistics
Pages:
41–51
URL:
https://aclanthology.org/2023.crac-sharedtask.4
DOI:
10.18653/v1/2023.crac-sharedtask.4
Cite (ACL):
Milan Straka. 2023. ÚFAL CorPipe at CRAC 2023: Larger Context Improves Multilingual Coreference Resolution. In Proceedings of the CRAC 2023 Shared Task on Multilingual Coreference Resolution, pages 41–51, Singapore. Association for Computational Linguistics.
Cite (Informal):
ÚFAL CorPipe at CRAC 2023: Larger Context Improves Multilingual Coreference Resolution (Straka, CRAC-WS 2023)
PDF:
https://preview.aclanthology.org/naacl24-info/2023.crac-sharedtask.4.pdf
Video:
https://preview.aclanthology.org/naacl24-info/2023.crac-sharedtask.4.mp4