Correcting Sense Annotations Using Wordnets and Translations

Arnob Mallik, Grzegorz Kondrak


Abstract
Acquiring large amounts of high-quality annotated data is an open issue in word sense disambiguation. This problem has become more critical recently with the advent of supervised models based on neural networks, which require large amounts of annotated data. We propose two algorithms for making selective corrections on a sense-annotated parallel corpus, based on cross-lingual synset mappings. We show that, when applied to bilingual parallel corpora, these algorithms can rectify noisy sense annotations, and thereby produce multilingual sense-annotated data of improved quality.
Anthology ID:
2023.gwc-1.33
Volume:
Proceedings of the 12th Global Wordnet Conference
Month:
January
Year:
2023
Address:
University of the Basque Country, Donostia - San Sebastian, Basque Country
Editors:
German Rigau, Francis Bond, Alexandre Rademaker
Venue:
GWC
Publisher:
Global Wordnet Association
Pages:
269–273
URL:
https://aclanthology.org/2023.gwc-1.33
Cite (ACL):
Arnob Mallik and Grzegorz Kondrak. 2023. Correcting Sense Annotations Using Wordnets and Translations. In Proceedings of the 12th Global Wordnet Conference, pages 269–273, University of the Basque Country, Donostia - San Sebastian, Basque Country. Global Wordnet Association.
Cite (Informal):
Correcting Sense Annotations Using Wordnets and Translations (Mallik & Kondrak, GWC 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.gwc-1.33.pdf