A Low-Shot Prompting Approach to Lemmatization in the EvaCun 2025 Shared Task

John Sbur, Brandi Wilkins, Elizabeth Paul, Yudong Liu


Abstract
This study explores the use of low-shot prompting techniques for the lemmatization of ancient cuneiform languages using Large Language Models (LLMs). To structure the input data and systematically design effective prompt templates, we employed a hierarchical clustering approach based on Levenshtein distance. The prompt design followed established engineering patterns, incorporating instructional and response-guiding elements to enhance model comprehension. We employed the In-Context Learning (ICL) prompting strategy, selecting example words primarily based on lemma frequency, ensuring a balance between commonly occurring words and rare cases to improve generalization. During testing on the development set, prompts included structured examples and explicit formatting rules, with accuracy assessed by comparing model predictions to ground truth lemmas. The results showed that model performance varied significantly across different configurations, with accuracy reaching approximately 90% in the best case for in-vocabulary words and around 9% in the best case for out-of-vocabulary (OOV) words. Despite resource constraints and the lack of input from a language expert, our findings suggest that prompt engineering strategies hold promise for improving LLM performance in cuneiform language lemmatization.
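The abstract names three mechanical ingredients: Levenshtein distance (the metric behind the clustering step), frequency-based selection of in-context examples, and exact-match accuracy against gold lemmas. The Python sketch below illustrates each in miniature and is not the authors' implementation. The function names, the prompt wording, the `n_frequent` and `n_rare` parameters, and the English placeholder word pairs are all assumptions made for illustration, since the abstract does not specify the actual templates or data format.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_examples(pairs, n_frequent=2, n_rare=1):
    """Pick (form, lemma) examples: the most frequent lemmas plus a few rare ones.

    Hypothetical selection rule; the abstract only says examples were chosen
    primarily by lemma frequency while balancing common and rare cases.
    """
    counts = Counter(lemma for _, lemma in pairs)
    ranked = [lemma for lemma, _ in counts.most_common()]
    chosen = ranked[:n_frequent] + (ranked[-n_rare:] if n_rare else [])
    chosen = list(dict.fromkeys(chosen))  # drop duplicates, keep order
    first_form = {}
    for form, lemma in pairs:
        first_form.setdefault(lemma, form)  # first attested form per lemma
    return [(first_form[lemma], lemma) for lemma in chosen]


def build_prompt(examples, target):
    """Assemble a few-shot prompt: instruction, worked examples, then the query."""
    lines = ["Lemmatize the word. Reply with the lemma only.", ""]
    for form, lemma in examples:
        lines.append(f"Word: {form}\nLemma: {lemma}")
    lines.append(f"Word: {target}\nLemma:")
    return "\n".join(lines)


def accuracy(predictions, gold):
    """Fraction of predictions that exactly match the gold lemmas."""
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Placeholder training pairs (English stand-ins, not cuneiform data).
train = [("walks", "walk"), ("walked", "walk"), ("walking", "walk"),
         ("ran", "run"), ("runs", "run"), ("mice", "mouse")]
examples = select_examples(train)
prompt = build_prompt(examples, "running")
```

The resulting `prompt` string would be sent to an LLM, and the returned lemmas scored with `accuracy`. A pairwise matrix of `levenshtein` values over the word forms could likewise serve as input to an off-the-shelf hierarchical clustering routine.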
Anthology ID:
2025.alp-1.31
Volume:
Proceedings of the Second Workshop on Ancient Language Processing
Month:
May
Year:
2025
Address:
The Albuquerque Convention Center, Laguna
Editors:
Adam Anderson, Shai Gordin, Bin Li, Yudong Liu, Marco C. Passarotti, Rachele Sprugnoli
Venues:
ALP | WS
Publisher:
Association for Computational Linguistics
Pages:
232–236
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.alp-1.31/
Cite (ACL):
John Sbur, Brandi Wilkins, Elizabeth Paul, and Yudong Liu. 2025. A Low-Shot Prompting Approach to Lemmatization in the EvaCun 2025 Shared Task. In Proceedings of the Second Workshop on Ancient Language Processing, pages 232–236, The Albuquerque Convention Center, Laguna. Association for Computational Linguistics.
Cite (Informal):
A Low-Shot Prompting Approach to Lemmatization in the EvaCun 2025 Shared Task (Sbur et al., ALP 2025)
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.alp-1.31.pdf