Damian Stachura

2026

Visual-language models (VLMs) are rapidly advancing on tasks that require visual understanding of text, tables, plots, and diagrams. Yet extracting structured information from text-heavy scientific diagrams remains challenging, as it requires not only OCR but also recovery of layout, grouping, and flow relationships. We study this problem in the context of CONSORT flow diagrams, which summarize participant screening, randomization, follow-up, and analysis in randomized controlled trials. We introduce a 200-example benchmark of PubMed Central diagrams, annotated by a biomedical team specializing in systematic literature reviews and clinical evidence extraction, and evaluate schema-constrained CONSORT extraction across proprietary and open-weight model families. Using structure-aware metrics, we compare single-pass and stepwise extraction strategies. Expert-guided single-pass extraction performs best for proprietary frontier models, with Gemini 3 Pro achieving the strongest overall results, whereas stepwise prompting improves less capable open-weight models on challenging arm-level extraction. These results offer practical deployment guidance and suggest that high-quality schema-constrained extraction is feasible, but not yet solved.

2025

Perplexity-Driven Contrastive Scoring for Unsupervised Detection of AI-Generated Texts in Polish
Damian Stachura
Proceedings of the PolEval 2025 Workshop

The SMIGIEL competition at PolEval 2025 focuses on distinguishing Polish human-written text from AI-generated text. I participated in one of the subtasks that required a zero-shot detection method. My solution adapts the Binoculars detector by pairing language models and using calibrated thresholds. Specifically, I replaced the English language models from the original Binoculars method with models trained on Polish corpora. This approach achieved first place in the chosen competition track. Overall, my findings demonstrate that domain-specific language models and careful thresholding enable state-of-the-art zero-shot AI-text detection performance across new languages and domains. The code is publicly available at https://github.com/damian1996/2025-smigiel.
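
To make the scoring idea in the abstract above concrete, here is a minimal sketch of a Binoculars-style score. It is not the code from the linked repository. It assumes the score is the log-perplexity of the text under one model divided by the cross-perplexity between two models, and that text scoring below a calibrated threshold is flagged as AI-generated. The function names, the toy probability lists, and the threshold value are all hypothetical; in practice the per-position distributions would come from the softmax outputs of two Polish language models.

```python
import math


def log_perplexity(probs, token_ids):
    """Mean negative log-probability that one model assigns to the observed tokens."""
    return -sum(math.log(p[t]) for p, t in zip(probs, token_ids)) / len(token_ids)


def cross_perplexity(probs_m1, probs_m2):
    """Mean cross-entropy between the two models' next-token distributions.

    Each position contributes -sum_v m1[v] * log(m2[v]).
    """
    total = 0.0
    for p1, p2 in zip(probs_m1, probs_m2):
        total += -sum(a * math.log(b) for a, b in zip(p1, p2))
    return total / len(probs_m1)


def binoculars_style_score(probs_m1, probs_m2, token_ids):
    """Ratio of log-perplexity to cross-perplexity (assumed form of the score)."""
    return log_perplexity(probs_m1, token_ids) / cross_perplexity(probs_m1, probs_m2)


def is_ai_generated(score, threshold):
    """Assumed decision rule: low scores are flagged as AI-generated."""
    return score < threshold


# Toy example: vocabulary of two tokens, a sequence of two positions.
# These distributions are made up for illustration only.
toy_m1 = [[0.9, 0.1], [0.2, 0.8]]
toy_m2 = [[0.8, 0.2], [0.3, 0.7]]
toy_tokens = [0, 1]

toy_score = binoculars_style_score(toy_m1, toy_m2, toy_tokens)
toy_flag = is_ai_generated(toy_score, threshold=0.9)  # threshold is hypothetical
```

The threshold would be calibrated on held-out data; the value used here is only a placeholder.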
