Abstract
This work describes the first thorough analysis of “header” signs in proto-Elamite, an undeciphered script from 3100–2900 BCE. Headers are a category of signs which have been provisionally identified through painstaking manual analysis of this script by domain experts. We use unsupervised neural and statistical sequence modeling techniques to provide new and independent evidence for the existence of headers, without supervision from domain experts. Having affirmed the existence of headers as a legitimate structural feature, we next arrive at a richer understanding of their possible meaning and purpose by (i) examining which features predict their presence; (ii) identifying correlations between these features and other document properties; and (iii) examining cases where these features predict the presence of a header in texts where domain experts do not expect one (or vice versa). We provide more concrete processes for labeling headers in this corpus and a clearer justification for existing intuitions about document structure in proto-Elamite.
- Anthology ID:
- 2022.emnlp-main.620
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9111–9121
- URL:
- https://aclanthology.org/2022.emnlp-main.620
- DOI:
- 10.18653/v1/2022.emnlp-main.620
- Cite (ACL):
- Logan Born, M. Monroe, Kathryn Kelley, and Anoop Sarkar. 2022. Sequence Models for Document Structure Identification in an Undeciphered Script. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 9111–9121, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Sequence Models for Document Structure Identification in an Undeciphered Script (Born et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2022.emnlp-main.620.pdf