Quantifying Text Reuse Across Three Kṛṣṇa Yajurveda Recensions: Using Multi-Algorithm Computational Collation

So Miyagawa, Kyoko Amano, Yuzuki Tsukagoshi, Yuki Kyogoku


Abstract
The Kṛṣṇa Yajurveda survives in multiple recensions that share substantial ritual content, yet the degree and distribution of textual overlap across recensions have never been quantified systematically. This paper presents a computational analysis of text reuse across three recensions—the Maitrāyaṇī Saṃhitā (MS), the Kāṭhaka Saṃhitā (KS), and the Taittirīya Saṃhitā (TS)—for two ritual sections (Agnyupasthāna and Punarādhāna), using ICoMa (Intertextuality Collation Machine), a new web-based multi-algorithm collation tool. Five independent similarity algorithms consistently rank MS–KS as the most closely related pair, corroborating the philological consensus. Crucially, the two ritual sections exhibit strikingly different reuse profiles: Punarādhāna shows near-identical MS–KS overlap (up to 93.5%) with sharp divergence from TS, while Agnyupasthāna displays moderate, broadly distributed similarity across all three pairs. These contrasting patterns provide quantitative evidence that different ritual categories followed distinct paths of textual transmission within the Yajurvedic tradition. ICoMa and the experimental data are freely available.
Anthology ID:
2026.nlp4dh-1.5
Volume:
Proceedings of the 6th International Conference on Natural Language Processing for the Digital Humanities
Month:
July
Year:
2026
Address:
San Diego, USA
Editors:
Sil Hamilton, Emily Öhman, Rebecca M. M. Hicke, Yuri Bizzoni, Axel Bax, Jacob A. Matthews, Mika Hämäläinen
Venues:
NLP4DH | WS
Publisher:
Association for Computational Linguistics
Pages:
41–49
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.nlp4dh-1.5/
Cite (ACL):
So Miyagawa, Kyoko Amano, Yuzuki Tsukagoshi, and Yuki Kyogoku. 2026. Quantifying Text Reuse Across Three Kṛṣṇa Yajurveda Recensions: Using Multi-Algorithm Computational Collation. In Proceedings of the 6th International Conference on Natural Language Processing for the Digital Humanities, pages 41–49, San Diego, USA. Association for Computational Linguistics.
Cite (Informal):
Quantifying Text Reuse Across Three Kṛṣṇa Yajurveda Recensions: Using Multi-Algorithm Computational Collation (Miyagawa et al., NLP4DH 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.nlp4dh-1.5.pdf