Renate Delucchi Danhier
2025
ExpLay: A new Corpus Resource for the Research on Expertise as an Influential Factor on Language Production
Carmen Schacht | Renate Delucchi Danhier
Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)
This paper introduces the ExpLay-Pipeline, a novel semi-automated processing tool designed for the analysis of language production data from experts in comparison to the language production of a control group of laypeople. The pipeline combines manual annotation and curation with state-of-the-art machine learning and rule-based methods, following a silver standard approach. It integrates various analysis modules specifically for the syntactic and lexical evaluation of parsed linguistic data. While implemented initially for the creation of the ExpLay-Corpus, it is designed for the processing of linguistic data in general. The paper details the design and implementation of this pipeline.