Christoph Draxler

2026

A Semi-Automatic Workflow for Transcribing and Annotating Broadcast News
Christoph Draxler | Sven Grawunder | Jürgen Trouvain | Felicitas Kleber
Proceedings of the Fifteenth Language Resources and Evaluation Conference

Audio data archived in radio broadcast stations represent a rich source for various research purposes from phonetic questions up to training and test data for speech modelling. We present an efficient semi-automatic workflow for pre-processing, transcribing and analysing large linguistic-phonetic audio corpora. As a pilot study, we process radio broadcast news from a German public radio station containing recordings from 1956 until 2017. The workflow consists of basic preprocessing, automatic speech recognition, manual word correction, automatic generation of pairs of audio chunks and transcripts, plus an automatic word-, syllable- and phoneme-level segmentation of these chunks. The workflow is organised using the Octra Backend management tool, manual validation and correction of transcripts and chunking are performed using the Octra editor, and the BAS web services perform the segmentation. In an example analysis we show with our specific radio corpus how to use it for comparative longitudinal structure analyses of broadcast news, and for text- and signal-based studies on changes of speech and articulation rate.

Text+ is the German distributed research data infrastructure for literary studies, linguistics, and spoken and written language. Its resources consist of contemporary and historical literary and media texts, deeply annotated material, transcripts of spoken and sign language, and original recordings. Text+ provides access to its resources according to the FAIR guidelines: Findable due to standard-conformant metadata, Accessible with single sign-on authentication, Interoperable via open data formats, and Reusable through web services and extensive documentation. The 30+ partners of Text+ are archives, libraries, universities, and other research institutions. The partners are autonomous, and they differ in the amount of data and processing capabilities they provide. In this paper, we describe the hub architecture of Text+, which gives users a central and FAIR point of access to research data that continues to be distributed across the Text+ partner institutions. The architecture serves as a blueprint for evolving research infrastructures that aim at maintaining (and empowering) their research data contributors.

2024

Speech Technology Services for Oral History Research
Christoph Draxler | Henk van den Heuvel | Arjan van Hessen | Pavel Ircing | Jan Lehečka
Proceedings of the First Workshop on Holocaust Testimonies as Language Resources (HTRes) @ LREC-COLING 2024

Oral history is about oral sources of witnesses and commentators on historical events. Speech technology is an important instrument to process such recordings in order to obtain transcription and further enhancements to structure the oral account. In this contribution we address the transcription portal and the webservices associated with speech processing at BAS, speech solutions developed at LINDAT, how to do it yourself with Whisper, remaining challenges, and future developments.

2020

Building a Time-Aligned Cross-Linguistic Reference Corpus from Language Documentation Data (DoReCo)
Ludger Paschen | François Delafontaine | Christoph Draxler | Susanne Fuchs | Matthew Stave | Frank Seifart
Proceedings of the Twelfth Language Resources and Evaluation Conference

Natural speech data on many languages have been collected by language documentation projects aiming to preserve linguistic and cultural traditions in audiovisual records. These data hold great potential for large-scale cross-linguistic research into phonetics and language processing. Major obstacles to utilizing such data for typological studies include the non-homogeneous nature of file formats and annotation conventions found both across and within archived collections. Moreover, time-aligned audio transcriptions are typically only available at the level of broad (multi-word) phrases but not at the word and segment levels. We report on solutions developed for these issues within the DoReCo (DOcumentation REference COrpus) project. DoReCo aims at providing time-aligned transcriptions for at least 50 collections of under-resourced languages. This paper gives a preliminary overview of the current state of the project and details our workflow, in particular standardization of formats and conventions, the addition of segmental alignments with WebMAUS, and DoReCo’s applicability for subsequent research programs. By making the data accessible to the scientific community, DoReCo is designed to bridge the gap between language documentation and linguistic inquiry.

A CLARIN Transcription Portal for Interview Data
Christoph Draxler | Henk van den Heuvel | Arjan van Hessen | Silvia Calamai | Louise Corti
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this paper we present a first version of a transcription portal for audio files based on automatic speech recognition (ASR) in various languages. The portal is implemented in the CLARIN resources research network and intended for use by non-technical scholars. We explain the background and interdisciplinary nature of interview data, the perks and quirks of using ASR for transcribing the audio in a research context, the dos and don’ts for optimal use of the portal, and future developments foreseen. The portal is promoted in a range of workshops, but there are a number of challenges that have to be met. These challenges concern privacy issues, ASR quality, and cost, amongst others.

2016

The BAS Speech Data Repository
Uwe Reichel | Florian Schiel | Thomas Kisler | Christoph Draxler | Nina Pörner
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The BAS CLARIN speech data repository is introduced. In its current state it comprises 31 predominantly German corpora of spoken language. It is compliant with the CLARIN-D as well as the OLAC requirements. This enables its embedding into several infrastructures. We give an overview of its structure, its implementation, and the corpora it contains.

BAS Speech Science Web Services - an Update of Current Developments
Thomas Kisler | Uwe Reichel | Florian Schiel | Christoph Draxler | Bernhard Jackl | Nina Pörner
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In 2012 the Bavarian Archive for Speech Signals started providing some of its tools from the field of spoken language in the form of Software as a Service (SaaS). This means users access the processing functionality over a web browser and therefore do not have to install complex software packages on a local computer. Amongst others, these tools include segmentation & labeling, grapheme-to-phoneme conversion, text alignment, syllabification and metadata generation, where all but the last are available for a variety of languages. Since its creation the number of available services and the web interface have changed considerably. We give an overview and a detailed description of the system architecture, the available web services and their functionality. Furthermore, we show how the number of files processed by the system has developed over the last four years.

2014

Online experiments with the Percy software framework - experiences and some early results
Christoph Draxler
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In early 2012 the online perception experiment software Percy was deployed on a production server at our lab. Since then, 38 experiments have been made publicly available, with a total of 3078 experiment sessions. In the course of time, the software has been continuously updated and extended to adapt to changing user requirements. Web-based editors for the structure and layout of the experiments have been developed. This paper describes the system architecture, presents usage statistics, discusses typical characteristics of online experiments, and gives an outlook on ongoing work. webapp.phonetik.uni-muenchen.de/WebExperiment lists all currently active experiments.

2008

F0 of Adolescent Speakers - First Results for the German Ph@ttSessionz Database
Christoph Draxler | Florian Schiel | Tania Ellbogen
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

The first release of the German Ph@ttSessionz speech database contains read and spontaneous speech from 864 adolescent speakers and is the largest database of its kind for German. It was recorded via the WWW in over 40 public schools in all dialect regions of Germany. In this paper, we present a cross-sectional study of f0 measurements on this database. The study documents the profound changes in male voices at ages 13-15. Furthermore, it shows that on a perceptive mel-scale, there is little difference in the relative f0 variability for male and female speakers. A closer analysis reveals that f0 variability is dependent on the speech style and both the length and the type of the utterance. The study provides statistically reliable voice parameters of adolescent speakers for German. The results may contribute to making spoken dialog systems more robust by restricting user input to utterances with low f0 variability.

2006

Speech Recordings in Public Schools in Germany - the Perfect Show Case for Web-based Recordings and Annotation
Christoph Draxler | Klaus Jänsch
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In the Ph@ttSessionz project, geographically distributed high-bandwidth recordings of adolescent speakers are performed in public schools all over Germany. To achieve a consistent technical signal quality, a standard configuration of recording equipment is sent to the participating schools. The recordings are made using the SpeechRecorder software for prompted speech recordings via the WWW. During a recording session, prompts are downloaded from a server, and the speech data is uploaded to the server in a background process. This paper focuses on the technical aspects of the distributed Ph@ttSessionz speech recordings and their annotation.

2004

SpeechRecorder - a Universal Platform Independent Multi-Channel Audio Recording Software
Christoph Draxler | Klaus Jänsch
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2002

Three New Corpora at the Bavarian Archive for Speech Signals – and a First Step Towards Distributed Web-Based Recording
Christoph Draxler | Florian Schiel
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2000
