2024
ChiCA: un corpus de conversations face-à-face vs. Zoom entre enfants et parents
Dhia Elhak Goumri | Abhishek Agrawal | Mitja Nikolaus | Hong Duc Thang Vu | Kübra Bodur | Elias Semmar | Cassandre Armand | Chiara Mazzocconi | Shreejata Gupta | Laurent Prévot | Benoit Favre | Leonor Becerra-Bonache | Abdellah Fourtassi
Actes de la 31ème Conférence sur le Traitement Automatique des Langues Naturelles, volume 2 : traductions d'articles publiés
Existing studies of speech in natural interaction have mainly focused on the two ends of the developmental spectrum, i.e., early childhood and adulthood, leaving a gap in our knowledge about how development unfolds, especially during the school-age years (6 to 11). The current work contributes to filling this gap by introducing a developmental corpus of child-parent conversations at home, involving groups of children aged 7, 9, and 11 whose native language is French. Each dyad was recorded twice: once face-to-face and once using computer-mediated video calls. For the face-to-face settings, we capitalized on recent advances in mobile eye-tracking and head-motion detection technology to optimize the naturalness of the recordings, allowing us to obtain both precise and ecologically valid data. Furthermore, we circumvented the difficulties of manual annotation by relying, to the extent possible, on automatic tools for speech processing and computer vision. Finally, to demonstrate the richness of this corpus for the study of children's communicative development, we provide preliminary analyses comparing several measures of child-parent conversational dynamics across age, modality, and communicative medium. We hope the current work will pave the way for future discoveries into the properties and mechanisms of multimodal communicative development during the school-age years.
Automatic Annotation of Grammaticality in Child-Caregiver Conversations
Mitja Nikolaus | Abhishek Agrawal | Petros Kaklamanis | Alex Warstadt | Abdellah Fourtassi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The acquisition of grammar has been a central question to adjudicate between theories of language acquisition. In order to conduct faster, more reproducible, and larger-scale corpus studies on grammaticality in child-caregiver conversations, tools for automatic annotation can offer an effective alternative to tedious manual annotation. We propose a coding scheme for context-dependent grammaticality in child-caregiver conversations and annotate more than 4,000 utterances from a large corpus of transcribed conversations. Based on these annotations, we train and evaluate a range of NLP models. Our results show that fine-tuned Transformer-based models perform best, achieving human inter-annotation agreement levels. As a first application and sanity check of this tool, we use the trained models to annotate a corpus almost two orders of magnitude larger than the manually annotated data and verify that children’s grammaticality shows a steady increase with age. This work contributes to the growing literature on applying state-of-the-art NLP methods to help study child language acquisition at scale.
Automatic Coding of Contingency in Child-Caregiver Conversations
Abhishek Agrawal | Mitja Nikolaus | Benoit Favre | Abdellah Fourtassi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
One of the most important communicative skills children have to learn is to engage in meaningful conversations with people around them. At the heart of this learning lies the mastery of contingency, i.e., the ability to contribute to an ongoing exchange in a relevant fashion (e.g., by staying on topic). Current research on this question relies on the manual annotation of a small sample of children, which limits our ability to draw general conclusions about development. Here, we propose to mitigate the limitations of manual labor by relying on automatic tools for contingency judgment in children’s early natural interactions with caregivers. Drawing inspiration from the field of dialogue systems evaluation, we built and compared several automatic classifiers. We found that a Transformer-based pre-trained language model – when fine-tuned on a relatively small set of data we annotated manually (around 3,500 turns) – provided the best predictions. We used this model to automatically annotate new, large-scale data, almost two orders of magnitude larger than our fine-tuning set. It was able to replicate existing results and generate new data-driven hypotheses. The broad impact of the work is to provide resources that can help the language development community study communicative development at scale, leading to more robust theories.
CHICA: A Developmental Corpus of Child-Caregiver’s Face-to-face vs. Video Call Conversations in Middle Childhood
Dhia Elhak Goumri | Abhishek Agrawal | Mitja Nikolaus | Hong Duc Thang Vu | Kübra Bodur | Elias Semmar | Cassandre Armand | Chiara Mazzocconi | Shreejata Gupta | Laurent Prévot | Benoit Favre | Leonor Becerra-Bonache | Abdellah Fourtassi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Existing studies of naturally occurring language-in-interaction have largely focused on the two ends of the developmental spectrum, i.e., early childhood and adulthood, leaving a gap in our knowledge about how development unfolds, especially across middle childhood. The current work contributes to filling this gap by introducing CHICA (for Child Interpersonal Communication Analysis), a developmental corpus of child-caregiver conversations at home, involving groups of French-speaking children aged 7, 9, and 11 years old. Each dyad was recorded twice: once in a face-to-face setting and once using computer-mediated video calls. For the face-to-face settings, we capitalized on recent advances in mobile, lightweight eye-tracking and head motion detection technology to optimize the naturalness of the recordings, allowing us to obtain both precise and ecologically valid data. Further, we mitigated the challenges of manual annotation by relying – to the extent possible – on automatic tools in speech processing and computer vision. Finally, to demonstrate the richness of this corpus for the study of child communicative development, we provide preliminary analyses comparing several measures of child-caregiver conversational dynamics across developmental age, modality, and communicative medium. We hope the current corpus will allow new discoveries into the properties and mechanisms of multimodal communicative development across middle childhood.
2020
Eyes on the Parse: Using Gaze Features in Syntactic Parsing
Abhishek Agrawal | Rudolf Rosa
Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
In this paper, we explore the potential benefits of leveraging eye-tracking information for dependency parsing on the English part of the Dundee corpus. To achieve this, we cast dependency parsing as a sequence labelling task and then augment the neural model for sequence labelling with eye-tracking features. We also augment a graph-based parser with eye-tracking features and parse the Dundee corpus to corroborate our findings from the sequence labelling parser. We then experiment with a variety of parser setups, ranging from parsing with all features to a delexicalized parser. Our experiments show that for a parser with all features, although the improvements in LAS score are positive, they are not significant, whereas our delexicalized parser significantly outperforms the baseline we established. We also analyze the contribution of various eye-tracking features to the different parser setups and find that eye-tracking features contain information that is complementary in nature, implying that augmenting the parser with various gaze features grouped together provides better performance than any individual gaze feature.