Fabian Retkowski


2025

pdf bib
Zero-Shot Strategies for Length-Controllable Summarization
Fabian Retkowski | Alexander Waibel
Findings of the Association for Computational Linguistics: NAACL 2025

Large language models (LLMs) struggle with precise length control, particularly in zero-shot settings. We conduct a comprehensive study evaluating LLMs’ length control capabilities across multiple measures and propose practical methods to improve controllability. Our experiments with LLaMA 3 reveal stark differences in length adherence across measures and highlight inherent biases of the model. To address these challenges, we introduce a set of methods: length approximation, target adjustment, sample filtering, and automated revisions. By combining these methods, we demonstrate substantial improvements in length compliance while maintaining or enhancing summary quality, providing highly effective zero-shot strategies for precise length control without the need for model fine-tuning or architectural changes. With our work, we not only advance our understanding of LLM behavior in controlled text generation but also pave the way for more reliable and adaptable summarization systems in real-world applications.

pdf bib
The AI Co-Ethnographer: How Far Can Automation Take Qualitative Research?
Fabian Retkowski | Andreas Sudmann | Alexander Waibel
Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities

Qualitative research often involves labor-intensive processes that are difficult to scale while preserving analytical depth. This paper introduces The AI Co-Ethnographer (AICoE), a novel end-to-end pipeline developed for qualitative research and designed to move beyond the limitations of simply automating code assignments, offering a more integrated approach. AICoE organizes the entire process, encompassing open coding, code consolidation, code application, and even pattern discovery, leading to a comprehensive analysis of qualitative data.

2024

pdf bib
From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions
Fabian Retkowski | Alexander Waibel
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Text segmentation is a fundamental task in natural language processing, where documents are split into contiguous sections. However, prior research in this area has been constrained by limited datasets, which are either small in scale, synthesized, or only contain well-structured documents. In this paper, we address these limitations by introducing a novel benchmark YTSeg focusing on spoken content that is inherently more unstructured and both topically and structurally diverse. As part of this work, we introduce an efficient hierarchical segmentation model MiniSeg, that outperforms state-of-the-art baselines. Lastly, we expand the notion of text segmentation to a more practical “smart chaptering” task that involves the segmentation of unstructured content, the generation of meaningful segment titles, and a potential real-time application of the models.

2023

pdf bib
End-to-End Evaluation for Low-Latency Simultaneous Speech Translation
Christian Huber | Tu Anh Dinh | Carlos Mullov | Ngoc-Quan Pham | Thai Binh Nguyen | Fabian Retkowski | Stefan Constantin | Enes Ugan | Danni Liu | Zhaolin Li | Sai Koneru | Jan Niehues | Alexander Waibel
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

The challenge of low-latency speech translation has recently draw significant interest in the research community as shown by several publications and shared tasks. Therefore, it is essential to evaluate these different approaches in realistic scenarios. However, currently only specific aspects of the systems are evaluated and often it is not possible to compare different approaches. In this work, we propose the first framework to perform and evaluate the various aspects of low-latency speech translation under realistic conditions. The evaluation is carried out in an end-to-end fashion. This includes the segmentation of the audio as well as the run-time of the different components. Secondly, we compare different approaches to low-latency speech translation using this framework. We evaluate models with the option to revise the output as well as methods with fixed output. Furthermore, we directly compare state-of-the-art cascaded as well as end-to-end systems. Finally, the framework allows to automatically evaluate the translation quality as well as latency and also provides a web interface to show the low-latency model outputs to the user.