Maxim Konca


2024

pdf
German SRL: Corpus Construction and Model Training
Maxim Konca | Andy Luecking | Alexander Mehler
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

A useful semantic role-annotated resource for training semantic role models for the German language is missing. We point out some problems of previous resources and provide a new one due to a combined translation and alignment process: The gold standard CoNLL-2012 semantic role annotations are translated into German. Semantic role labels are transferred due to alignment models. The resulting dataset is used to train a German semantic role model. With F1-scores around 0.7, the major roles achieve competitive evaluation scores, but avoid limitations of previous approaches. The described procedure can be applied to other languages as well.