Maciej Rapacz


2026

Literary translation is rarely a neutral act of linguistic transfer; rather, it is a continuous series of conscious interventions: restructurings, semantic shifts, and stylistic adaptations. While Translation Studies analyzes these shifts qualitatively, current computational methods focus primarily on quality evaluation (e.g., BLEU, COMET) or authorship attribution (e.g., stylometry) and lack a scalable metric to quantify the extent and character of the translator’s intervention. We propose a novel method to measure the translator’s signal by using Interlinear Translation - a strict word-for-word gloss - as a computational baseline representing translational "Degree Zero," i.e., a neutral form of the source text devoid of any stylistic adaptation. We define the Intervention Vector as the semantic difference between a literary translation and its interlinear counterpart in a high-dimensional vector space. We validate this approach on a multilingual corpus of Greek New Testament translations comprising 5 interlinear baselines and 74 literary translations across 5 languages: English (16), French (14), Italian (12), Polish (16), and Spanish (16). Our results demonstrate that the magnitude of the Intervention Vector effectively ranks texts along a spectrum from literal to paraphrase, aligning with established theoretical categories. This magnitude consistently distinguishes between translation strategies, yielding significantly longer vectors for dynamic and paraphrase strategies than for literal and formal ones. This framework provides a quantitative method for analyzing translator agency without requiring a comprehensive corpus of reference translations.
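The core computation behind the Intervention Vector is simple to sketch. The Python snippet below is an illustrative sketch only, not the study's released code: the choice of encoder (here LaBSE via sentence-transformers) and the toy example sentences are assumptions, but the structure follows the abstract's definition - embed the literary translation and its interlinear gloss in a shared vector space, take their difference, and use the vector's length as the measure of intervention.

```python
# Illustrative sketch (not the authors' implementation): the Intervention
# Vector as the embedding difference between a literary translation and
# its interlinear baseline. The encoder is an assumption; any multilingual
# sentence encoder with a shared vector space would play the same role.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/LaBSE")  # assumed encoder

def intervention_vector(literary: str, interlinear: str) -> np.ndarray:
    """Semantic difference between a literary rendering and the
    word-for-word interlinear gloss of the same source passage."""
    lit_vec, gloss_vec = model.encode([literary, interlinear])
    return lit_vec - gloss_vec

# Toy illustration: the vector magnitude should grow as the translation
# moves from literal toward paraphrase.
gloss = "In beginning was the Word, and the Word was with the God"
formal = "In the beginning was the Word, and the Word was with God."
dynamic = "Before anything existed, the Word was already there with God."

for label, text in [("formal", formal), ("dynamic", dynamic)]:
    magnitude = float(np.linalg.norm(intervention_vector(text, gloss)))
    print(f"{label}: |v| = {magnitude:.3f}")
```

Under this framing, only one interlinear baseline per source text is needed, which is what lets the method dispense with a large corpus of reference translations.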

2025

Contemporary machine translation systems prioritize fluent, natural-sounding output with flexible word ordering. In contrast, interlinear translation preserves the source text’s syntactic structure by aligning target-language words directly beneath their source counterparts. Despite its importance in classical scholarship, interlinear translation remains understudied as a target for automation. We evaluated neural interlinear translation from Ancient Greek into English and Polish using four transformer-based models: two specialized for Ancient Greek (GreTa and PhilTa) and two general-purpose multilingual models (mT5-base and mT5-large). Our approach introduces novel morphological embedding layers and evaluates text preprocessing and tag set selection across 144 experimental configurations on a word-aligned parallel corpus of the Greek New Testament. Results show that injecting morphological features through dedicated embedding layers significantly enhances translation quality, improving BLEU scores by 35% (44.67 → 60.40) for English and 38% (42.92 → 59.33) for Polish over baseline models. PhilTa achieves state-of-the-art performance for English, while mT5-large does so for Polish. Notably, PhilTa maintains stable performance using only 10% of the training data. Our findings challenge the assumption that modern neural architectures cannot benefit from explicit morphological annotations. While preprocessing strategies and tag set selection show minimal impact, the substantial gains from morphological embeddings demonstrate their value in low-resource scenarios.
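To make the mechanism concrete, here is a minimal PyTorch sketch of a morphological embedding layer in the general spirit described above. It is not the paper's reported configuration: the tag inventory size, model dimension, and the additive combination of token and tag embeddings are all illustrative assumptions.

```python
# Minimal sketch of a morphological embedding layer (assumptions, not the
# paper's exact design): a dedicated embedding table for morphological tags
# whose output is added to the subword embeddings before the encoder.
import torch
import torch.nn as nn

class MorphAwareEmbedding(nn.Module):
    def __init__(self, vocab_size: int, num_morph_tags: int, d_model: int):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # Index 0 is reserved as padding for tokens without a morph tag.
        self.morph_emb = nn.Embedding(num_morph_tags + 1, d_model, padding_idx=0)

    def forward(self, token_ids: torch.Tensor, morph_ids: torch.Tensor) -> torch.Tensor:
        # token_ids, morph_ids: (batch, seq_len); each subword carries the
        # morphological tag of the source word it is aligned to.
        return self.token_emb(token_ids) + self.morph_emb(morph_ids)

# Toy usage with illustrative sizes: batch of 2 sequences, 5 subwords each.
emb = MorphAwareEmbedding(vocab_size=32128, num_morph_tags=500, d_model=768)
tokens = torch.randint(1, 32128, (2, 5))
morphs = torch.randint(0, 501, (2, 5))
print(emb(tokens, morphs).shape)  # torch.Size([2, 5, 768])
```

The design choice worth noting is that the morphological signal enters as a learned additive embedding rather than as extra input tokens, so sequence length and the pretrained vocabulary are left untouched.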