UD-English-CHILDES: A Collected Resource of Gold and Silver Universal Dependencies Trees for Child Language Interactions
Xiulin Yang, Zhuoxuan Ju, Lanni Bu, Zoey Liu, Nathan Schneider
Abstract
CHILDES is a widely used resource of transcribed child and child-directed speech. This paper introduces UD-English-CHILDES, the first officially released Universal Dependencies (UD) treebank derived from previously dependency-annotated CHILDES data, which we harmonize to follow unified annotation principles. The gold-standard trees encompass utterances sampled from 11 children and their caregivers, totaling over 48K sentences (236K tokens). We validate these gold-standard annotations under the UD v2 framework and provide an additional 1M silver-standard sentences, offering a consistent resource for computational and linguistic research.
- Anthology ID:
- 2025.udw-1.6
- Volume:
- Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025)
- Month:
- August
- Year:
- 2025
- Address:
- Ljubljana, Slovenia
- Editors:
- Gosse Bouma, Çağrı Çöltekin
- Venues:
- UDW | WS | SyntaxFest
- SIG:
- SIGPARSE
- Publisher:
- Association for Computational Linguistics
- Pages:
- 52–58
- URL:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.udw-1.6/
- Cite (ACL):
- Xiulin Yang, Zhuoxuan Ju, Lanni Bu, Zoey Liu, and Nathan Schneider. 2025. UD-English-CHILDES: A Collected Resource of Gold and Silver Universal Dependencies Trees for Child Language Interactions. In Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025), pages 52–58, Ljubljana, Slovenia. Association for Computational Linguistics.
- Cite (Informal):
- UD-English-CHILDES: A Collected Resource of Gold and Silver Universal Dependencies Trees for Child Language Interactions (Yang et al., UDW-SyntaxFest 2025)
- PDF:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.udw-1.6.pdf