Reconstruction of Cuneiform Literary Texts as Text Matching
Fabian Simonjetz, Jussi Laasonen, Yunus Cobanoglu, Alexander Fraser, Enrique Jiménez
Abstract
Ancient Mesopotamian literature is riddled with gaps, caused by the decay and fragmentation of its writing material, clay tablets. The discovery of overlaps between fragments allows reconstruction to advance, but it is a slow and unsystematic process. Since new pieces are found and digitized constantly, NLP techniques can help to identify fragments and match them with existing text collections to restore complete literary works. We compare a number of approaches and determine that a character-level n-gram-based similarity matching approach works well for this problem, leading to a large speed-up for researchers in Assyriology.
- Anthology ID:
- 2024.lrec-main.1197
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 13712–13721
- URL:
- https://aclanthology.org/2024.lrec-main.1197
- Cite (ACL):
- Fabian Simonjetz, Jussi Laasonen, Yunus Cobanoglu, Alexander Fraser, and Enrique Jiménez. 2024. Reconstruction of Cuneiform Literary Texts as Text Matching. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 13712–13721, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Reconstruction of Cuneiform Literary Texts as Text Matching (Simonjetz et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.lrec-main.1197.pdf
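
The abstract names character-level n-gram similarity matching as the approach that works well for pairing fragments with known texts. The abstract does not specify the similarity measure or the n-gram size, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a containment score (the share of a fragment's n-grams that also occur in a candidate text) and n = 3; the function names `char_ngrams`, `containment`, and `rank_candidates` are hypothetical, and the strings in the usage example are toy placeholders, not real transliterations.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams in `text` (empty if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def containment(fragment, candidate, n=3):
    """Share of the fragment's n-grams that also occur in the candidate text.

    Assumption: containment is used here instead of a symmetric measure because
    a fragment is usually much shorter than the text it belongs to.
    """
    frag_grams = char_ngrams(fragment, n)
    if not frag_grams:
        return 0.0
    return len(frag_grams & char_ngrams(candidate, n)) / len(frag_grams)


def rank_candidates(fragment, corpus, n=3):
    """Score every text in `corpus` (a name -> text dict) and sort best first."""
    scored = [(name, containment(fragment, text, n)) for name, text in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy placeholder strings standing in for transliterated sign sequences.
    corpus = {"text_a": "abcdefghij", "text_b": "zzzzzzzzzz"}
    for name, score in rank_candidates("cdefg", corpus):
        print(name, round(score, 2))
```

In a setting like the one the abstract describes, the ranked list would only be a shortlist for a researcher to check by hand; the score alone does not establish that a fragment belongs to a text.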