Knut Kvale


2025

pdf bib
Match ‘em: Multi-Tiered Alignment for Error Analysis in ASR
Phoebe Parsons | Knut Kvale | Torbjørn Svendsen | Giampiero Salvi
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)

We introduce “Match ‘em”: a new framework for aligning output from automatic speech recognition (ASR) with reference transcriptions. This allows a more detailed analysis of errors produced by end-to-end ASR systems compared to word error rate (WER). Match ‘em performs the alignment on both the word and character level; each relying on information from the other to provide the most meaningful global alignment. At the character level, we define a speech production motivated character similarity metric. At the word level, we rely on character similarities to define word similarity and, additionally, we reconcile compounding (insertion or deletion of spaces). We evaluated Match ‘em on transcripts of three European languages produced by wav2vec2 and Whisper. We show that Match ‘em results in more similar word substitution pairs and that compound reconciling can capture a broad range of spacing errors. We believe Match ‘em to be a valuable tool for ASR error analysis across many languages.

pdf bib
Adding Metadata to Existing Parliamentary Speech Corpus
Phoebe Parsons | Per Erik Solberg | Knut Kvale | Torbjørn Svendsen | Giampiero Salvi
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)

Parliamentary proceedings are convenient data sources for creating corpora for speech technology. Given its public nature, there is an abundance of extra information about the speakers that can be legally and ethically harvested to enrich this kind of corpora. This paper describes the methods we have used to add speaker metadata to the Stortinget Speech Corpus (SSC) containing over 5,000 hours of Norwegian speech with non-verbatim transcripts but without speaker metadata. The additional metadata for each speech segment includes speaker ID, gender, date of birth, municipality of birth, and counties represented. We also infer speaker dialect from their municipality of birth using a manually designed mapping between municipalities and Norwegian dialects. We provide observations on the SSC data and give suggestions for how it may be used for tasks other than speech recognition. Finally, we demonstrate the utility of this new metadata through a dialect identification task. The described methods can be adapted to add metadata information to parliamentary corpora in other languages.

2023

pdf bib
A character-based analysis of impacts of dialects on end-to-end Norwegian ASR
Phoebe Parsons | Knut Kvale | Torbjørn Svendsen | Giampiero Salvi
Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)

We present a method for analyzing character errors for use with character-based, end-to-end ASR systems, as used herein for investigating dialectal speech. As end-to-end systems are able to produce novel spellings, there exists a possibility that the spelling variants produced by these systems can capture phonological information beyond the intended target word. We therefore first introduce a way of guaranteeing that similar words and characters are paired during alignment, thus ensuring that any resulting analysis of character errors is founded on sound substitutions. Then, from such a careful character alignment, we find trends in system-generated spellings that align with known phonological features of Norwegian dialects, in particular, “r” and “l” confusability and voiceless stop lenition. Through this analysis, we demonstrate that cues from acoustic dialectal features can influence the output of an end-to-end ASR systems.

2006

pdf bib
KUNSTI - Knowledge Generation for Norwegian Language Technology
Bente Maegaard | Jens-Erik Fenstad | Lars Ahrenberg | Knut Kvale | Katarina Mühlenbock | Bernt-Erik Heid
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

KUNSTI is the Norwegian national language technology programme, running 2001-2006 inclusive. The goal of the programme is to boost Norwegian language technology research. In this paper we describe the background, the objectives, the methodology applied in the management of the programme, the projects selected, and our first conclusions. We also describe national programmes form Sweden, France and Germany and compare objectives and methods.