Pre-annotation Matters: A Comparative Study on POS and Dependency Annotation for an Alsatian Dialect

Delphine Bernhard, Nathanaël Beiner, Barbara Hoff


Abstract
The annotation of corpora for lower-resource languages can benefit from automatic pre-annotation to increase the throughput of the annotation process in a context where human resources are scarce. However, this can be hindered by the lack of available pre-annotation tools. In this work, we compare three pre-annotation methods in zero-shot or near-zero-shot contexts for part-of-speech (POS) and dependency annotation of an Alsatian Alemannic dialect. Our study shows that good levels of annotation quality can be achieved, with human annotators adapting their correction effort to the perceived quality of the pre-annotation. The pre-annotation tools also vary in efficiency depending on the task, with better global results for a system trained on closely related languages and dialects.
Anthology ID:
2025.law-1.14
Volume:
Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Siyao Peng, Ines Rehbein
Venues:
LAW | WS
Publisher:
Association for Computational Linguistics
Pages:
173–186
URL:
https://preview.aclanthology.org/landing_page/2025.law-1.14/
DOI:
10.18653/v1/2025.law-1.14
Cite (ACL):
Delphine Bernhard, Nathanaël Beiner, and Barbara Hoff. 2025. Pre-annotation Matters: A Comparative Study on POS and Dependency Annotation for an Alsatian Dialect. In Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025), pages 173–186, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Pre-annotation Matters: A Comparative Study on POS and Dependency Annotation for an Alsatian Dialect (Bernhard et al., LAW 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.law-1.14.pdf