Mats Wirén

Also published as: Mats Wiren


2024

Evaluation of Really Good Grammatical Error Correction
Robert Östling | Katarina Gillholm | Murathan Kurfalı | Marie Mattson | Mats Wirén
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Traditional evaluation methods for Grammatical Error Correction (GEC) fail to capture the full range of system capabilities and objectives. The emergence of large language models (LLMs) has further highlighted the shortcomings of these evaluation strategies, emphasizing the need for a paradigm shift in evaluation methodology. In the current study, we perform a comprehensive evaluation of various GEC systems using a recently published dataset of Swedish learner texts. The evaluation is performed using established evaluation metrics as well as human judges. We find that GPT-3 in a few-shot setting by far outperforms previous grammatical error correction systems for Swedish, a language comprising only about 0.1% of its training data. We also find that current evaluation methods contain undesirable biases that a human evaluation is able to reveal. We suggest using human post-editing of GEC system outputs to analyze the amount of change required to reach native-level human performance on the task, and provide a dataset annotated with human post-edits and assessments of grammaticality, fluency and meaning preservation of GEC system outputs.

2020

A Multi-word Expression Dataset for Swedish
Murathan Kurfalı | Robert Östling | Johan Sjons | Mats Wirén
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present a new set of 96 Swedish multi-word expressions annotated with degree of (non-)compositionality. In contrast to most previous compositionality datasets, we also consider syntactically complex constructions and publish a formal specification of each expression. This allows evaluation of computational models beyond word bigrams, which have so far been the norm. Finally, we use the annotations to evaluate a system for automatic compositionality estimation based on distributional semantics. Our analysis of the disagreements between human annotators and the distributional model reveals interesting questions related to the perception of compositionality, and should be informative to future work in the area.

Zero-shot cross-lingual identification of direct speech using distant supervision
Murathan Kurfalı | Mats Wirén
Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

Prose fiction typically consists of passages alternating between the narrator’s telling of the story and the characters’ direct speech in that story. Detecting direct speech is crucial for the downstream analysis of narrative structure, and may seem easy at first thanks to quotation marks. However, typographical conventions vary across languages, and as a result, almost all approaches to this problem have been monolingual. In contrast, the aim of this paper is to provide a multilingual method for identifying direct speech. To this end, we created a training corpus by using a set of heuristics to automatically find texts where quotation marks appear sufficiently consistently. We then removed the quotation marks and developed a sequence classifier based on multilingual-BERT which classifies each token as belonging to narration or speech. Crucially, by training the classifier with the quotation marks removed, it was forced to learn the linguistic characteristics of direct speech rather than the typography of quotation marks. The results in the zero-shot setting of the proposed model are comparable to the strong supervised baselines, indicating that this is a feasible approach.

2018

Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction
Adam Ek | Mats Wirén | Robert Östling | Kristina N. Björkenstam | Gintarė Grigonytė | Sofia Gustafson Capková
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Learner Corpus Anonymization in the Age of GDPR: Insights from the Creation of a Learner Corpus of Swedish
Beáta Megyesi | Lena Granstedt | Sofia Johansson | Julia Prentice | Dan Rosén | Carl-Johan Schenström | Gunlög Sundberg | Mats Wirén | Elena Volodina
Proceedings of the 7th workshop on NLP for Computer Assisted Language Learning

2017

Universal Dependencies for Swedish Sign Language
Robert Östling | Carl Börstell | Moa Gärdenfors | Mats Wirén
Proceedings of the 21st Nordic Conference on Computational Linguistics

2016

Longitudinal Studies of Variation Sets in Child-directed Speech
Mats Wirén | Kristina Nilsson Björkenstam | Gintarė Grigonytė | Elisabet Eir Cortes
Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning

Modelling the informativeness and timing of non-verbal cues in parent-child interaction
Kristina Nilsson Björkenstam | Mats Wirén | Robert Östling
Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning

2014

Improving Readability of Swedish Electronic Health Records through Lexical Simplification: First Results
Gintarė Grigonytė | Maria Kvist | Sumithra Velupillai | Mats Wirén
Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR)

2013

Book Review:
Mats Wirén
Computational Linguistics, Volume 39, Issue 3 - September 2013

2007

Experiences of an In-Service Wizard-of-Oz Data Collection for the Deployment of a Call-Routing Application
Mats Wirén | Robert Eklund
Proceedings of the Workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technologies

Multi-slot semantics for natural-language call routing systems
Johan Boye | Mats Wirén
Proceedings of the Workshop on Bridging the Gap: Academic and Industrial Research in Dialog Technologies

2004

The NICE Fairy-tale Game System
Joakim Gustafson | Linda Bell | Johan Boye | Anders Lindström | Mats Wirén
Proceedings of the 5th SIGdial Workshop on Discourse and Dialogue at HLT-NAACL 2004

1997

Translation Methodology in the Spoken Language Translator: An Evaluation
David Carter | Ralph Becket | Manny Rayner | Robert Eklund | Catriona MacDermid | Mats Wirén | Sabine Kirchmeier-Andersen | Christina Philp
Spoken Language Translation

Recycling Lingware in a Multilingual MT System
Manny Rayner | David Carter | Ivan Bretan | Robert Eklund | Mats Wiren | Steffen Leo Hansen | Sabine Kirchmeier-Andersen | Christina Philp | Finn Sorensen | Hanne Erdman Thomsen
From Research to Commercial Applications: Making NLP Work in Practice

1994

Minimal Change and Bounded Incremental Parsing
Mats Wiren
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

1990

Incremental Parsing and Reason Maintenance
Mats Wiren
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics

1989

Interactive Incremental Chart Parsing
Mats Wiren
Fourth Conference of the European Chapter of the Association for Computational Linguistics

1987

A Comparison of Rule-Invocation Strategies in Context-Free Chart Parsing
Mats Wiren
Third Conference of the European Chapter of the Association for Computational Linguistics