Ronja Laarmann-Quante

Also published as: Ronja Laarmann-quante

2024

Automatic Extraction of Nominal Phrases from German Learner Texts of Different Proficiency Levels
Ronja Laarmann-Quante | Marco Müller | Eva Belke
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Correctly inflecting determiners and adjectives so that they agree with the noun in nominal phrases (NPs) is a big challenge for learners of German. Given the increasing number of available learner corpora, a large-scale corpus-based study on the acquisition of this aspect of German morphosyntax would be desirable. In this paper, we present a pilot study in which we investigate how well nouns, their grammatical heads and the dependents that have to agree with the noun can be extracted automatically via dependency parsing. For six samples of the German learner corpus MERLIN (one per proficiency level), we found that in spite of many ungrammatical sentences in texts of low proficiency levels, human annotators find only few true ambiguities that would make the extraction of NPs and their heads infeasible. The automatic parsers, however, perform rather poorly on extracting the relevant elements for texts on CEFR levels A1-B1 (< 70%) but quite well from level B2 onwards (≥ 90%). We discuss the sources of errors and how performance could potentially be increased in the future.

Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
Ekaterina Kochmar | Marie Bexte | Jill Burstein | Andrea Horbach | Ronja Laarmann-Quante | Anaïs Tack | Victoria Yaneva | Zheng Yuan

2023

Preserving the Authenticity of Handwritten Learner Language: Annotation Guidelines for Creating Transcripts Retaining Orthographic Features
Christian Gold | Ronja Laarmann-quante | Torsten Zesch
Proceedings of the Workshop on Computation and Written Language (CAWL 2023)

Handwritten texts produced by young learners often contain orthographic features like spelling errors, capitalization errors, punctuation errors, and impurities such as strikethroughs, inserts, and smudges. All of those are typically normalized or ignored in existing transcriptions. For applications like handwriting recognition with the goal of automatically analyzing a learner’s language performance, however, retaining such features would be necessary. To address this, we present transcription guidelines that retain the features addressed above. Our guidelines were developed iteratively and include numerous example images to illustrate the various issues. On a subset of about 90 double-transcribed texts, we compute inter-annotator agreement and show that our guidelines can be applied with high levels of percentage agreement of about .98. Overall, we transcribed 1,350 learner texts, which is about the same size as the widely adopted handwriting recognition datasets IAM (1,500 pages) and CVL (1,600 pages). Our final corpus can be used to train a handwriting recognition system that transcribes closely to the real productions by young learners. Such a system is a prerequisite for applying automatic orthography feedback systems to handwritten texts in the future.

Recognizing Learner Handwriting Retaining Orthographic Errors for Enabling Fine-Grained Error Feedback
Christian Gold | Ronja Laarmann-Quante | Torsten Zesch
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

This paper addresses the problem of providing automatic feedback on orthographic errors in handwritten text. Despite the availability of automatic error detection systems, the practical problem of digitizing the handwriting remains. Current handwriting recognition (HWR) systems produce highly accurate transcriptions but normalize away the very errors that are essential for providing useful feedback, e.g. orthographic errors. Our contribution is twofold: First, we create a comprehensive dataset of handwritten text with transcripts retaining orthographic errors by transcribing 1,350 pages from the German learner dataset FD-LEX. Second, we train a simple HWR system on our dataset, allowing it to transcribe words with orthographic errors. Thereby, we evaluate the effect of different dictionaries on recognition output, highlighting the importance of addressing spelling errors in these dictionaries.

Manual and Automatic Identification of Similar Arguments in EFL Learner Essays
Ahmed Mousa | Ronja Laarmann-Quante | Andrea Horbach
Proceedings of the 12th Workshop on NLP for Computer Assisted Language Learning

2022

‘Meet me at the ribary’ – Acceptability of spelling variants in free-text answers to listening comprehension prompts
Ronja Laarmann-Quante | Leska Schwarz | Andrea Horbach | Torsten Zesch
Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022)

When listening comprehension is tested as a free-text production task, a challenge for scoring the answers is the resulting wide range of spelling variants. When judging whether a variant is acceptable or not, human raters perform a complex holistic decision. In this paper, we present a corpus study in which we analyze human acceptability decisions in a high stakes test for German. We show that for human experts, spelling variants are harder to score consistently than other answer variants. Furthermore, we examine how the decision can be operationalized using features that could be applied by an automatic scoring system. We show that simple measures like edit distance and phonetic similarity between a given answer and the target answer can model the human acceptability decisions with the same inter-annotator agreement as humans, and discuss implications of the remaining inconsistencies.

Bringing Automatic Scoring into the Classroom - Measuring the Impact of Automated Analytic Feedback on Student Writing Performance
Andrea Horbach | Ronja Laarmann-Quante | Lucas Liebenow | Thorben Jansen | Stefan Keller | Jennifer Meyer | Torsten Zesch | Johanna Fleckenstein
Proceedings of the 11th Workshop on NLP for Computer Assisted Language Learning

Evaluating Automatic Spelling Correction Tools on German Primary School Children’s Misspellings
Ronja Laarmann-Quante | Lisa Prepens | Torsten Zesch
Proceedings of the 11th Workshop on NLP for Computer Assisted Language Learning

LeSpell - A Multi-Lingual Benchmark Corpus of Spelling Errors to Develop Spellchecking Methods for Learner Language
Marie Bexte | Ronja Laarmann-Quante | Andrea Horbach | Torsten Zesch
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Spellchecking text written by language learners is especially challenging because errors made by learners differ both quantitatively and qualitatively from errors made by already proficient learners. We introduce LeSpell, a multi-lingual (English, German, Italian, and Czech) evaluation data set of spelling mistakes in context that we compiled from seven underlying learner corpora. Our experiments show that existing spellcheckers do not work well with learner data. Thus, we introduce a highly customizable spellchecking component for the DKPro architecture, which improves performance in many settings.

2019

The making of the Litkey Corpus, a richly annotated longitudinal corpus of German texts written by primary school children
Ronja Laarmann-Quante | Stefanie Dipper | Eva Belke
Proceedings of the 13th Linguistic Annotation Workshop

To date, corpus and computational linguistic work on written language acquisition has mostly dealt with second language learners who have usually already mastered orthography acquisition in their first language. In this paper, we present the Litkey Corpus, a richly-annotated longitudinal corpus of written texts produced by primary school children in Germany from grades 2 to 4. The paper focuses on the (semi-)automatic annotation procedure at various linguistic levels, which include POS tags, features of the word-internal structure (phonemes, syllables, morphemes) and key orthographic features of the target words as well as a categorization of spelling errors. Comprehensive evaluations show that high accuracy was achieved on all levels, making the Litkey Corpus a useful resource for corpus-based research on literacy acquisition of German primary school children and for developing NLP tools for educational purposes. The corpus is freely available under https://www.linguistics.rub.de/litkeycorpus/.

2017

Annotating Orthographic Target Hypotheses in a German L1 Learner Corpus
Ronja Laarmann-Quante | Katrin Ortmann | Anna Ehlert | Maurice Vogel | Stefanie Dipper
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

NLP applications for learners often rely on annotated learner corpora. Thereby, it is important that the annotations are both meaningful for the task, and consistent and reliable. We present a new longitudinal L1 learner corpus for German (handwritten texts collected in grade 2–4), which is transcribed and annotated with a target hypothesis that strictly only corrects orthographic errors, and is thereby tailored to research and tool development for orthographic issues in primary school. While for most corpora, transcription and target hypothesis are not evaluated, we conducted a detailed inter-annotator agreement study for both tasks. Although we achieved high agreement, our discussion of cases of disagreement shows that even with detailed guidelines, annotators differ here and there for different reasons, which should also be considered when working with transcriptions and target hypotheses of other corpora, especially if no explicit guidelines for their construction are known.

2016

Annotating Spelling Errors in German Texts Produced by Primary School Children
Ronja Laarmann-Quante | Lukas Knichel | Stefanie Dipper | Carina Betken
Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016)
