Abstract
We consider two related problems in this paper. Given an undeciphered alphabetic writing system or mono-alphabetic cipher, determine: (1) which of its letters are vowels and which are consonants; and (2) whether the writing system is a vocalic alphabet or an abjad. We are able to show that a very simple spectral decomposition based on character co-occurrences provides nearly perfect performance with respect to answering both question types.

- Anthology ID: W17-4112
- Volume: Proceedings of the First Workshop on Subword and Character Level Models in NLP
- Month: September
- Year: 2017
- Address: Copenhagen, Denmark
- Editors: Manaal Faruqui, Hinrich Schuetze, Isabel Trancoso, Yadollah Yaghoobzadeh
- Venue: SCLeM
- Publisher: Association for Computational Linguistics
- Pages: 82–91
- URL: https://aclanthology.org/W17-4112
- DOI: 10.18653/v1/W17-4112
- Cite (ACL): Patricia Thaine and Gerald Penn. 2017. Vowel and Consonant Classification through Spectral Decomposition. In Proceedings of the First Workshop on Subword and Character Level Models in NLP, pages 82–91, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal): Vowel and Consonant Classification through Spectral Decomposition (Thaine & Penn, SCLeM 2017)
- PDF: https://preview.aclanthology.org/dois-2013-emnlp/W17-4112.pdf