Sign Clustering and Topic Extraction in Proto-Elamite
Logan Born, Kate Kelley, Nishant Kambhatla, Carolyn Chen, Anoop Sarkar
Abstract
We describe a first attempt at using techniques from computational linguistics to analyze the undeciphered proto-Elamite script. Using hierarchical clustering, n-gram frequencies, and LDA topic models, we both replicate results obtained by manual decipherment and reveal previously-unobserved relationships between signs. This demonstrates the utility of these techniques as an aid to manual decipherment.
- Anthology ID:
- W19-2516
- Volume:
- Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, USA
- Editors:
- Beatrice Alex, Stefania Degaetano-Ortlieb, Anna Kazantseva, Nils Reiter, Stan Szpakowicz
- Venue:
- LaTeCH
- SIG:
- SIGHUM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 122–132
- URL:
- https://aclanthology.org/W19-2516
- DOI:
- 10.18653/v1/W19-2516
- Cite (ACL):
- Logan Born, Kate Kelley, Nishant Kambhatla, Carolyn Chen, and Anoop Sarkar. 2019. Sign Clustering and Topic Extraction in Proto-Elamite. In Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 122–132, Minneapolis, USA. Association for Computational Linguistics.
- Cite (Informal):
- Sign Clustering and Topic Extraction in Proto-Elamite (Born et al., LaTeCH 2019)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/W19-2516.pdf
- Code:
- sfu-natlang/pe-decipher-toolkit