RAG-Enhanced Neural Machine Translation of Ancient Egyptian Text: A Case Study of THOTH AI

So Miyagawa


Abstract
This paper demonstrates how Retrieval-Augmented Generation (RAG) significantly improves translation accuracy for Middle Egyptian, a historically rich but low-resource language. We integrate a vectorized Coptic-Egyptian lexicon and morphological database into a specialized tool called THOTH AI. By supplying domain-specific linguistic knowledge to Large Language Models (LLMs) like Claude 3.5 Sonnet, our system yields translations that are more contextually grounded and semantically precise. We compare THOTH AI against various mainstream models, including Gemini 2.0, DeepSeek R1, and GPT variants, evaluating performance with BLEU, SacreBLEU, METEOR, ROUGE, and chrF. Experimental results on the coronation decree of Thutmose I (18th Dynasty) show that THOTH AI’s RAG approach provides the most accurate translations, highlighting the critical value of domain knowledge in natural language processing for ancient, specialized corpora. Furthermore, we discuss how our method benefits e-learning, digital humanities, and language revitalization efforts, bridging the gap between purely data-driven approaches and expert-driven resources in historical linguistics.
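The retrieval step summarized in the abstract can be illustrated with a minimal sketch. It is not the THOTH AI implementation: the five-entry lexicon, the ASCII transliterations, the character-bigram vectors, and the function names `ngram_vector`, `retrieve`, and `build_prompt` are all assumptions made for illustration. A real system would presumably use learned embeddings and a far larger Coptic-Egyptian database. The sketch only shows the general pattern of looking up lexicon entries for each source token and placing them in the prompt sent to an LLM.

```python
import math
from collections import Counter

# Toy lexicon: hypothetical entries for illustration only,
# not taken from the THOTH AI database.
LEXICON = [
    {"translit": "nsw", "gloss": "king"},
    {"translit": "nTr", "gloss": "god"},
    {"translit": "pr", "gloss": "house"},
    {"translit": "anx", "gloss": "life; to live"},
    {"translit": "nfr", "gloss": "good, beautiful"},
]


def ngram_vector(text, n=2):
    """Represent a string as a sparse count vector of character n-grams."""
    padded = f" {text} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=2):
    """Return up to k lexicon entries most similar to the query token."""
    query_vec = ngram_vector(query)
    scored = [(cosine(query_vec, ngram_vector(e["translit"])), e) for e in LEXICON]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in scored[:k] if score > 0]


def build_prompt(source):
    """Assemble a translation prompt that includes retrieved lexicon entries."""
    hits = []
    for token in source.split():
        for entry in retrieve(token, k=1):
            if entry not in hits:
                hits.append(entry)
    context = "\n".join(f"- {e['translit']}: {e['gloss']}" for e in hits)
    return f"Lexicon entries:\n{context}\n\nTranslate into English: {source}"


if __name__ == "__main__":
    # The resulting string would be passed to an LLM of choice.
    print(build_prompt("anx nsw"))
```

Keeping retrieval separate from prompt assembly means the similarity function can be swapped for a dense embedding model without changing how the retrieved entries reach the model.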
Anthology ID:
2025.nlp4dh-1.4
Volume:
Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities
Month:
May
Year:
2025
Address:
Albuquerque, USA
Editors:
Mika Hämäläinen, Emily Öhman, Yuri Bizzoni, So Miyagawa, Khalid Alnajjar
Venues:
NLP4DH | WS
Publisher:
Association for Computational Linguistics
Pages:
33–40
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.nlp4dh-1.4/
Cite (ACL):
So Miyagawa. 2025. RAG-Enhanced Neural Machine Translation of Ancient Egyptian Text: A Case Study of THOTH AI. In Proceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, pages 33–40, Albuquerque, USA. Association for Computational Linguistics.
Cite (Informal):
RAG-Enhanced Neural Machine Translation of Ancient Egyptian Text: A Case Study of THOTH AI (Miyagawa, NLP4DH 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.nlp4dh-1.4.pdf