UD-CHILDES-BG: a dependency treebank of Bulgarian child and child-directed speech

Mila Marcheva-Nash, Yasena Chantova, Tsvetina Kirilova, Ivelina Pavlova, Tsvetelina Stefanova, Yoana Vasileva, Weiwei Sun


Abstract
This paper presents (i) UD-CHILDES-BG, a manually corrected Universal Dependencies treebank of Bulgarian child and child-directed speech, (ii) a quantitative and phenomenon-based evaluation of inter-annotator agreement on developmental data, and (iii) a systematic analysis of parser errors in this underrepresented domain. We manually correct 4,338 dependency parses (10% of the CHILDES-BG corpus), of which 14% are double-annotated. Inter-annotator agreement, measured as unlabeled/labeled attachment score (UAS/LAS), is 91.71/86.12 for child-directed speech (CDS) and 88.14/81.40 for child speech (CS). Parser performance on the manually corrected portion is 92.70/85.54 for CDS and 90.97/81.52 for CS, compared to a reported 93.37/90.21 on the test set of adult written language. Our analyses show that CDS and CS pose challenges for dependency annotation and parsing, particularly in discourse-related structures, which are less common in adult written language.
Anthology ID:
2026.law-main.9
Volume:
Proceedings of the 20th Linguistic Annotation Workshop (LAW XX)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Yang Janet Liu, Luke Gessler
Venues:
LAW | WS
Publisher:
Association for Computational Linguistics
Pages:
113–129
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.9/
Cite (ACL):
Mila Marcheva-Nash, Yasena Chantova, Tsvetina Kirilova, Ivelina Pavlova, Tsvetelina Stefanova, Yoana Vasileva, and Weiwei Sun. 2026. UD-CHILDES-BG: a dependency treebank of Bulgarian child and child-directed speech. In Proceedings of the 20th Linguistic Annotation Workshop (LAW XX), pages 113–129, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
UD-CHILDES-BG: a dependency treebank of Bulgarian child and child-directed speech (Marcheva-Nash et al., LAW 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.9.pdf