MucLex: A German Lexicon for Surface Realisation
Kira Klimt | Daniel Braun | Daniela Schneider | Florian Matthes
Proceedings of the Twelfth Language Resources and Evaluation Conference

Language resources for languages other than English are often scarce. Rule-based surface realisers need elaborate lexica to generate correct language, especially for languages like German, which have many irregular word forms. In this paper, we present MucLex, a German lexicon for the Natural Language Generation task of surface realisation, based on the crowd-sourced online lexicon Wiktionary. MucLex contains more than 100,000 lemmata and more than 670,000 different word forms in a well-structured XML file and is available under the Creative Commons BY-SA 3.0 license.


SimpleNLG-DE: Adapting SimpleNLG 4 to German
Daniel Braun | Kira Klimt | Daniela Schneider | Florian Matthes
Proceedings of the 12th International Conference on Natural Language Generation

SimpleNLG is a popular open source surface realiser for the English language. For German, however, open source, non-domain-specific realisers are scarce, partly due to the complexity of the German language. In this paper, we present SimpleNLG-DE, an adaptation of SimpleNLG to German. We discuss which parts of the German language have been implemented and how we evaluated our implementation using the TIGER Corpus and newly created datasets.