Abstract
The goal of this work is to introduce CHILDES-MWE, which contains English CHILDES corpora automatically annotated with Multiword Expressions (MWEs) information. The result is a resource with almost 350,000 sentences annotated with more than 70,000 distinct MWEs of various types from both longitudinal and latitudinal corpora. This resource can be used for large scale language acquisition studies of how MWEs feature in child language. Focusing on compound nouns (CN), we then verify in a longitudinal study if there are differences in the distribution and compositionality of CNs in child-directed and child-produced sentences across ages. Moreover, using additional latitudinal data, we investigate if there are further differences in CN usage and in compositionality preferences. The results obtained for the child-produced sentences reflect CN distribution and compositionality in child-directed sentences.
- Anthology ID:
- L16-1365
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 2307–2311
- URL:
- https://aclanthology.org/L16-1365
- Cite (ACL):
- Rodrigo Wilkens, Marco Idiart, and Aline Villavicencio. 2016. Multiword Expressions in Child Language. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2307–2311, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- Multiword Expressions in Child Language (Wilkens et al., LREC 2016)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/L16-1365.pdf