Computer, enhence: POS-tagging improvements for nonbinary pronoun use in Swedish

Henrik Björklund, Hannah Devinney


Abstract
Part-of-speech (POS) taggers for Swedish routinely fail on the third-person gender-neutral pronoun “hen”, despite the fact that it has been a well-established part of the Swedish language since at least 2014. In addition to simply being a form of gender bias, this failure can have negative effects on other tasks relying on POS information. We demonstrate the usefulness of semi-synthetic augmented datasets in a case study, retraining a POS tagger to correctly recognize “hen” as a personal pronoun. We evaluate our retrained models both on tag accuracy and on a downstream task (dependency parsing) in a classical NLP pipeline. Our results show that adding such data works to correct for the disparity in performance. The accuracy rate for identifying “hen” as a pronoun can be brought up to acceptable levels with only minor adjustments to the tagger’s vocabulary files. Performance parity with gendered pronouns can be reached after retraining with only a few hundred examples. This increase in POS tag accuracy also results in improvements in dependency parsing for sentences containing “hen”.
Anthology ID:
2023.ltedi-1.8
Volume:
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion
Month:
September
Year:
2023
Address:
Varna, Bulgaria
Editors:
Bharathi R. Chakravarthi, B. Bharathi, Josephine Griffith, Kalika Bali, Paul Buitelaar
Venues:
LTEDI | WS
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
54–61
URL:
https://aclanthology.org/2023.ltedi-1.8
Cite (ACL):
Henrik Björklund and Hannah Devinney. 2023. Computer, enhence: POS-tagging improvements for nonbinary pronoun use in Swedish. In Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion, pages 54–61, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Computer, enhence: POS-tagging improvements for nonbinary pronoun use in Swedish (Björklund & Devinney, LTEDI-WS 2023)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2023.ltedi-1.8.pdf