Automated Phonological Transcription of Akkadian Cuneiform Text

Aleksi Sahala, Miikka Silfverberg, Antti Arppe, Krister Lindén


Abstract
Akkadian was an East-Semitic language spoken in ancient Mesopotamia. The language is attested on hundreds of thousands of cuneiform clay tablets. Several Akkadian text corpora contain only the transliterated text. In this paper, we investigate automated phonological transcription of the transliterated corpora. The phonological transcription provides a linguistically appealing form to represent Akkadian, because the transcription is normalized according to the grammatical description of a given dialect and explicitly shows the Akkadian renderings for Sumerian logograms. Because cuneiform text does not mark the inflection for logograms, the inflected form needs to be inferred from the sentence context. To the best of our knowledge, this is the first documented attempt to automatically transcribe Akkadian. Using a context-aware neural network model, we are able to automatically transcribe syllabic tokens at near human performance with 96% recall @ 3, while the logogram transcription remains more challenging at 82% recall @ 3.
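The recall @ 3 figures quoted in the abstract count a token as correct when its reference transcription appears among the three highest-ranked candidates. The following is a minimal sketch of that metric, assuming the system outputs a ranked candidate list per token; the function name `recall_at_k` and the sample word forms are illustrative assumptions, not the authors' evaluation code or data.

```python
from typing import Sequence


def recall_at_k(
    ranked_candidates: Sequence[Sequence[str]],
    gold: Sequence[str],
    k: int = 3,
) -> float:
    """Return the fraction of tokens whose gold form is in the top-k candidates.

    Hypothetical helper: ranked_candidates[i] is assumed to be ordered from
    most to least likely transcription for token i.
    """
    if len(ranked_candidates) != len(gold):
        raise ValueError("need exactly one candidate list per gold token")
    if not gold:
        return 0.0
    hits = sum(
        1 for ranked, reference in zip(ranked_candidates, gold)
        if reference in ranked[:k]
    )
    return hits / len(gold)


# Illustrative (invented) ranked outputs for three tokens.
example_ranked = [
    ["šarrum", "šarram", "šarrim"],
    ["ana", "ina", "an"],
    ["bītim", "bītum", "bītam", "bīti"],
]
example_gold = ["šarram", "ana", "bīti"]

if __name__ == "__main__":
    print(recall_at_k(example_ranked, example_gold, k=3))
    print(recall_at_k(example_ranked, example_gold, k=1))
```

Raising `k` can only keep the score the same or increase it, which is why recall @ 3 is a more lenient measure than top-1 accuracy.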
Anthology ID:
2020.lrec-1.433
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
3528–3534
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.433
Cite (ACL):
Aleksi Sahala, Miikka Silfverberg, Antti Arppe, and Krister Lindén. 2020. Automated Phonological Transcription of Akkadian Cuneiform Text. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3528–3534, Marseille, France. European Language Resources Association.
Cite (Informal):
Automated Phonological Transcription of Akkadian Cuneiform Text (Sahala et al., LREC 2020)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2020.lrec-1.433.pdf