TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech

Shamira Venturini, Oliver Hennhöfer, Steffen Kinkel, Jannik Strötgen


Abstract
Fine-grained morphosyntactic error annotation is important in clinical and developmental language research, yet it is labour-intensive, expert-dependent, and difficult to scale. We present TalkTag, a lightweight LLM-based tool fine-tuned to automate CHAT-style error annotation in spoken-language transcripts. Developed under conditions of extreme data scarcity using children’s narrative data, the system shows that this kind of linguistic analysis is feasible in low-resource settings. Our evaluation shows that TalkTag produces encouragingly precise annotations while effectively identifying instances where linguistic ambiguity makes automated tagging genuinely difficult. With TalkTag, we provide a scalable alternative to manual morphosyntactic error annotation and practically viable support for annotators.
Anthology ID:
2026.law-main.20
Volume:
Proceedings of the 20th Linguistic Annotation Workshop (LAW XX)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Yang Janet Liu, Luke Gessler
Venues:
LAW | WS
Publisher:
Association for Computational Linguistics
Pages:
309–322
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.20/
Cite (ACL):
Shamira Venturini, Oliver Hennhöfer, Steffen Kinkel, and Jannik Strötgen. 2026. TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech. In Proceedings of the 20th Linguistic Annotation Workshop (LAW XX), pages 309–322, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech (Venturini et al., LAW 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.20.pdf