Paul Schmitt


2025

KR Labs at ArchEHR-QA 2025: A Verbatim Approach for Evidence-Based Question Answering
Adam Kovacs | Paul Schmitt | Gabor Recski
BioNLP 2025 Shared Tasks

We present a lightweight, domain-agnostic verbatim pipeline for evidence-grounded question answering. Our pipeline operates in two steps: first, a sentence-level extractor flags relevant note sentences using either zero-shot LLM prompts or supervised ModernBERT classifiers. Next, an LLM drafts a question-specific template, which is filled verbatim with sentences from the extraction step. This prevents hallucinations and ensures traceability. In the ArchEHR-QA 2025 shared task, our system scored 42.01%, ranking top-10 in core metrics and outperforming the organiser’s 70B-parameter Llama-3.3 baseline. We publicly release our code and inference scripts under an MIT license.
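
A minimal sketch of the two-step verbatim idea is shown below, assuming a generic `llm` callable that maps a prompt string to a completion and an invented `{SENT i}` placeholder convention; the prompt wording and function names are illustrative assumptions, not the exact pipeline released with the paper.

import re
from typing import Callable, List


def extract_evidence(llm: Callable[[str], str], question: str, note_sentences: List[str]) -> List[int]:
    """Step 1: flag note sentences relevant to the question (zero-shot prompt variant)."""
    numbered = "\n".join(f"{i}: {s}" for i, s in enumerate(note_sentences))
    prompt = (
        "Question:\n{q}\n\nClinical note sentences:\n{sents}\n\n"
        "List the indices of the sentences that answer the question, comma-separated."
    ).format(q=question, sents=numbered)
    reply = llm(prompt)
    return [int(i) for i in re.findall(r"\d+", reply) if int(i) < len(note_sentences)]


def answer_verbatim(llm: Callable[[str], str], question: str, note_sentences: List[str]) -> str:
    """Step 2: draft a question-specific template, then fill its slots verbatim with evidence."""
    evidence_ids = extract_evidence(llm, question, note_sentences)
    template_prompt = (
        f"Draft a short answer template for the question below. Refer to evidence only via "
        f"placeholders {{SENT 0}} ... {{SENT {len(evidence_ids) - 1}}}; do not paraphrase the note.\n"
        f"Question: {question}"
    )
    template = llm(template_prompt)
    # Fill each placeholder with the unmodified source sentence, so every claim is traceable.
    for slot, sent_id in enumerate(evidence_ids):
        template = template.replace(f"{{SENT {slot}}}", note_sentences[sent_id])
    return template

Because the generator only ever produces a template, the answer text that carries clinical content comes verbatim from the note, which is what rules out hallucinated evidence.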

2024

TPPMI - a Temporal Positive Pointwise Mutual Information Embedding of Words
Paul Schmitt | Zsófia Rakovics | Márton Rakovics | Gábor Recski
Proceedings of the 4th Workshop on Computational Linguistics for the Political and Social Sciences: Long and short papers

We present Temporal Positive Pointwise Mutual Information (TPPMI) embeddings as a robust and data-efficient alternative for modeling temporal semantic change. Based on the assumption that the semantics of the most frequent words in a corpus are relatively stable over time, our model represents words as vectors of their PPMI similarities with a predefined set of such context words. We evaluate our method on the temporal word analogy benchmark of Yao et al. (2018) and compare it to the TWEC model (Di Carlo et al., 2019), demonstrating the competitiveness of the approach. While the performance of TPPMI stays below that of the state-of-the-art TWEC model, it offers a higher degree of interpretability and is applicable in scenarios where only a limited amount of data is available.
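
As a rough illustration of the idea, the sketch below builds, for each time slice, a vector of PPMI values between every word and a fixed list of high-frequency context words. The function names, the symmetric co-occurrence window, and the plain unsmoothed PPMI formula max(0, log(P(w,c) / (P(w)P(c)))) are assumptions for illustration, not the paper's exact implementation.

from collections import Counter
from typing import Dict, List

import numpy as np


def tppmi_vectors(
    slices: Dict[int, List[List[str]]],   # time slice (e.g. year) -> tokenised sentences
    context_words: List[str],             # fixed high-frequency context vocabulary
    window: int = 5,
) -> Dict[int, Dict[str, np.ndarray]]:
    """For each time slice, embed every word as its PPMI vector over the fixed context words."""
    ctx_index = {c: j for j, c in enumerate(context_words)}
    embeddings: Dict[int, Dict[str, np.ndarray]] = {}
    for year, sentences in slices.items():
        cooc: Dict[str, Counter] = {}
        word_count: Counter = Counter()
        ctx_count = np.zeros(len(context_words))
        total = 0
        # Count word/context co-occurrences within a symmetric window, per time slice.
        for sent in sentences:
            for i, w in enumerate(sent):
                lo, hi = max(0, i - window), min(len(sent), i + window + 1)
                for k in range(lo, hi):
                    if k == i or sent[k] not in ctx_index:
                        continue
                    c = sent[k]
                    cooc.setdefault(w, Counter())[c] += 1
                    word_count[w] += 1
                    ctx_count[ctx_index[c]] += 1
                    total += 1
        # Turn counts into PPMI vectors over the fixed context dimensions.
        embeddings[year] = {}
        for w, ctxs in cooc.items():
            vec = np.zeros(len(context_words))
            for c, n in ctxs.items():
                j = ctx_index[c]
                pmi = np.log((n * total) / (word_count[w] * ctx_count[j]))
                vec[j] = max(pmi, 0.0)   # keep only positive PMI
            embeddings[year][w] = vec
    return embeddings

Because every dimension corresponds to a named context word, the resulting vectors can be read off directly, which is the interpretability advantage over dense diachronic embeddings such as TWEC.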