Robin Winkle


2026

Biographical sources, such as literary encyclopedias, encode knowledge about historical figures in textual form. In this paper, we address the task of consolidating structured biographical information about authors from the former German Democratic Republic into a unified database. To this end, we present a generalizable Information Extraction (IE) system based on LLM prompting. Specifically, we compare two mid-sized open-source models, Qwen-2.5-32B and Llama-3-70B-Instruct, investigate a range of Prompt Engineering (PE) strategies, and propose a semantic similarity-based evaluation metric for open-ended IE. Our experiments on an unpublished annotated subset of biographical texts yield moderate precision and variable recall, highlighting both the potential and the current limitations of generative IE in the Digital Humanities.