Michal Borský


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2022

pdf bib
Samrómur Children: An Icelandic Speech Corpus
Carlos Daniel Hernandez Mena | David Erik Mollberg | Michal Borský | Jón Guðnason
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Samrómur Children is an Icelandic speech corpus intended for the field of automatic speech recognition. It contains 131 hours of read speech from Icelandic children aged between 4 to 17 years. The test portion was meticulously selected to cover a wide range of ages as possible; we aimed to have exactly the same amount of data per age range. The speech was collected with the crowd-sourcing platform Samrómur.is, which is inspired on the “Mozilla’s Common Voice Project”. The corpus was developed within the framework of the “Language Technology Programme for Icelandic 2019 − 2023”; the goal of the project is to make Icelandic available in language-technology applications. Samrómur Children is the first corpus in Icelandic with children’s voices for public use under a Creative Commons license. Additionally, we present baseline experiments and results using Kaldi.