2025
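Chasing Sentences: Methods for Processing Dynamic Data to Trace Sentence Production (original title: À la poursuite de phrases: méthodes pour traiter des données dynamiques pour tracer la production de phrases)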
Malgorzata Anna Ulasik | Cerstin Mahlow
Proceedings of the Workshop on Processing Dynamic Language Data with NLP Tools and Methods 2025 (DYN-TAL)
We present methods for processing dynamic data that make it possible to trace the sentence production process. As an incremental and non-linear activity, writing produces incomplete or ill-formed intermediate versions that evolve through frequent revisions. Using keystroke logging and natural language processing (NLP) tools, we propose a framework for automatically reconstructing sentence histories. In addition, we implement in THEtool a model that synchronizes sentence histories with revision events and pause patterns. This multilayer representation supports a detailed understanding of the cognitive and linguistic aspects of sentence construction.
2020
Swiss-AL: A Multilingual Swiss Web Corpus for Applied Linguistics
Julia Krasselt | Philipp Dressen | Matthias Fluor | Cerstin Mahlow | Klaus Rothenhäusler | Maren Runte
Proceedings of the Twelfth Language Resources and Evaluation Conference
The Swiss Web Corpus for Applied Linguistics (Swiss-AL) is a multilingual (German, French, Italian) collection of texts from selected web sources. Unlike most other web corpora, it is not intended for NLP purposes, but rather designed to support data-based and data-driven research on societal and political discourses in Switzerland. It currently contains 8 million texts (approx. 1.55 billion tokens), including news and specialist publications, governmental opinions, parliamentary records, web sites of political parties, companies, and universities, and statements from industry associations and NGOs. A flexible processing pipeline using state-of-the-art components allows researchers in applied linguistics to create tailor-made subcorpora for studying discourse in a wide range of domains. So far, Swiss-AL has been used successfully in research on Swiss public discourses on energy and on antibiotic resistance.
2016
C-WEP―Rich Annotated Collection of Writing Errors by Professionals
Cerstin Mahlow
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This paper presents C-WEP, the Collection of Writing Errors by Professional Writers of German. It currently consists of 245 sentences with grammatical errors. All sentences are taken from published texts. All authors are professional writers with high skill levels with respect to German, the genres, and the topics. The purpose of this collection is to provide seeds for more sophisticated writing support tools, as only a very small proportion of those errors can be detected by state-of-the-art checkers. C-WEP is annotated on various levels and is freely available.
2012
Proceedings of the Second Workshop on Computational Linguistics and Writing (CL&W 2012): Linguistic and Cognitive Aspects of Document Creation and Document Engineering
Michael Piotrowski | Cerstin Mahlow | Robert Dale
2010
Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids
Michael Piotrowski | Cerstin Mahlow | Robert Dale
2004
Sentence Completion Tests for Training and Assessment in a Computational Linguistics Curriculum
Cerstin Mahlow | Michael Hess
Proceedings of the Workshop on eLearning for Computational Linguistics and Computational Linguistics for eLearning