A Dataset of Mycenaean Linear B Sequences

Katerina Papavassiliou, Gareth Owens, Dimitrios Kosmopoulos


Abstract
We present our work towards a dataset of Mycenaean Linear B sequences gathered from Mycenaean inscriptions written in the 14th and 13th centuries B.C. (c. 1400-1200 B.C.). The dataset contains sequences of Mycenaean words and ideograms that follow the rules of the Mycenaean Greek language of the Late Bronze Age. Our ultimate goal is to contribute to the study, reading and understanding of ancient scripts and languages. By focusing on sequences, we seek to exploit the structure of the entire language, not just the Mycenaean vocabulary, to analyse sequential patterns. We use the dataset in experiments on estimating the missing symbols in damaged inscriptions.
Anthology ID:
2020.lrec-1.311
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2552–2561
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.311
Cite (ACL):
Katerina Papavassiliou, Gareth Owens, and Dimitrios Kosmopoulos. 2020. A Dataset of Mycenaean Linear B Sequences. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2552–2561, Marseille, France. European Language Resources Association.
Cite (Informal):
A Dataset of Mycenaean Linear B Sequences (Papavassiliou et al., LREC 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2020.lrec-1.311.pdf