Sophie Repp

2026

In this paper, we investigate the factors that influence interactants’ choice of syntactic variants when using a predicative adjective construction like that is okay. In German colloquial conversation, such constructions can occur as full sentences with subject, copula and adjective (das ist gut ‘that is good’); with topic drop consisting of copula and adjective (ist gut); or as fragments consisting only of the adjective (gut). We present findings from a corpus of colloquial speech between fellow students showing that the interactional function of listener feedback has a higher predictive power in accounting for the use of fragments vs. fuller structures than adjective semantics (descriptive vs. evaluative), propositional structure (reference to individual or propositionally structured referent), and predictability in terms of adjective frequency. Moreover, we find that fragments consisting of evaluative adjectives show a clear tendency to be grounded in the here-and-now of the current situation, whereas fuller structures are more apt to express evaluations grounded in past experience. We argue that fragments are formally optimized to convey expressive actions such as listener feedback and other ad-hoc evaluations.


Object Realisation in Spoken Guadeloupan French: Evaluating NLP Models for an Under-Resourced Variety
Amalia Canes Nápoles | Sophie Repp
Proceedings of the Fifteenth Language Resources and Evaluation Conference

This paper contributes to the evaluation of natural language parsing models applied to colloquial speech in lesser-studied varieties of a language. We report on the performance of automatic speech recognition (ASR) and Universal Dependencies (UD) parsing models on a radio corpus of colloquial French spoken in Guadeloupe (GuaFr), which is in contact with a typologically distant language, French-based Guadeloupean Creole (GuaCr). The corpus poses specific challenges due to phonetic and syntactic specifics of GuaFr, as well as the occurrence of code switching to GuaCr. We show that weakening the ASR decoder’s language model (LM) in various parameters avoids hallucination of null objects, which have been described as typical for spoken GuaFr, but not of non-standard object clitic positioning. For UD parsing, we investigate utterance segmentation as the primary lever to affect model performance and compare different segmentation sources (ASR punctuation, manual chunking, UD parser tokenization) and their combination. We highlight both strengths and pitfalls of the models, again focussing on the expression of syntactic objects.

Co-authors

Petra B. Schumacher (1)

Venues

DND (1)
LREC (1)
