Raphael Schwitter

2024

pdf
LLM-based Translation Across 500 Years. The Case for Early New High German
Martin Volk | Dominic P. Fischer | Patricia Scheurer | Raphael Schwitter | Phillip B. Ströbel
Proceedings of the 20th Conference on Natural Language Processing (KONVENS 2024)

2022

pdf abs
Nunc profana tractemus. Detecting Code-Switching in a Large Corpus of 16th Century Letters
Martin Volk | Lukas Fischer | Patricia Scheurer | Bernard Silvan Schroffenegger | Raphael Schwitter | Phillip Ströbel | Benjamin Suter
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper is based on a collection of 16th century letters from and to the Zurich reformer Heinrich Bullinger. Around 12,000 letters of this exchange have been preserved, out of which 3100 have been professionally edited, and another 5500 are available as provisional transcriptions. We have investigated code-switching in these 8600 letters, first on the sentence-level and then on the word-level. In this paper we give an overview of the corpus and its language mix (mostly Early New High German and Latin, but also French, Greek, Italian and Hebrew). We report on our experiences with a popular language identifier and present our results when training an alternative identifier on a very small training corpus of only 150 sentences per language. We use the automatically labeled sentences in order to bootstrap a word-based language classifier which works with high accuracy. Our research around the corpus building and annotation involves automatic handwritten text recognition, text normalisation for ENH German, and machine translation from medieval Latin into modern German.

The evaluation of Handwritten Text Recognition (HTR) models during their development is straightforward: because HTR is a supervised problem, the usual data split into training, validation, and test data sets allows the evaluation of models in terms of accuracy or error rates. However, the evaluation process becomes tricky as soon as we switch from development to application. A compilation of a new (and forcibly smaller) ground truth (GT) from a sample of the data that we want to apply the model on and the subsequent evaluation of models thereon only provides hints about the quality of the recognised text, as do confidence scores (if available) the models return. Moreover, if we have several models at hand, we face a model selection problem since we want to obtain the best possible result during the application phase. This calls for GT-free metrics to select the best model, which is why we (re-)introduce and compare different metrics, from simple, lexicon-based to more elaborate ones using standard language models and masked language models (MLM). We show that MLM-based evaluation can compete with lexicon-based methods, with the advantage that large and multilingual transformers are readily available, thus making compiling lexical resources for other metrics superfluous.

pdf abs
Machine Translation of 16Th Century Letters from Latin to German
Lukas Fischer | Patricia Scheurer | Raphael Schwitter | Martin Volk
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages

This paper outlines our work in collecting training data for and developing a Latin–German Neural Machine Translation (NMT) system, for translating 16th century letters. While Latin–German is a low-resource language pair in terms of NMT, the domain of 16th century epistolary Latin is even more limited in this regard. Through our efforts in data collection and data generation, we are able to train a NMT model that provides good translations for short to medium sentences, and outperforms GoogleTranslate overall. We focus on the correspondence of the Swiss reformer Heinrich Bullinger, but our parallel corpus and our NMT system will be of use for many other texts of the time.

Co-authors

Martin Volk 4

Patricia Scheurer 3

Phillip B. Ströbel 1

Phillip Benjamin Ströbel 1

Phillip Ströbel 1

Simon Clematide 1

Tobias Hodel 1