Abstract
We introduce a simple method for extracting non-arbitrary form-meaning representations from a collection of semantic vectors. We treat the problem as one of feature selection for a model trained to predict word vectors from subword features. We apply this model to the problem of automatically discovering phonesthemes, which are submorphemic sound clusters that appear in words with similar meaning. Many of our model-predicted phonesthemes overlap with those proposed in the linguistics literature, and we validate our approach with human judgments.

- Anthology ID:
- W18-1206
- Volume:
- Proceedings of the Second Workshop on Subword/Character LEvel Models
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans
- Venue:
- SCLeM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 49–54
- URL:
- https://aclanthology.org/W18-1206
- DOI:
- 10.18653/v1/W18-1206
- Cite (ACL):
- Nelson F. Liu, Gina-Anne Levow, and Noah A. Smith. 2018. Discovering Phonesthemes with Sparse Regularization. In Proceedings of the Second Workshop on Subword/Character LEvel Models, pages 49–54, New Orleans. Association for Computational Linguistics.
- Cite (Informal):
- Discovering Phonesthemes with Sparse Regularization (Liu et al., SCLeM 2018)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/W18-1206.pdf
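
The abstract frames phonestheme discovery as feature selection in a model that predicts word vectors from subword features. The following is a minimal sketch of that general idea, not the authors' implementation. It rests on several assumptions that the abstract does not state: the subword features are word-initial two-letter clusters, the sparse regularizer is a plain L1 (lasso) penalty, and the model is fit by proximal gradient descent. The words and two-dimensional "semantic vectors" are made up for illustration, and the helper names (`initial_bigram_features`, `lasso_ista`) are hypothetical.

```python
import math

# Toy data, invented for illustration only: each word is paired with a
# made-up 2-d "semantic vector". The gl- and sn- words have consistent
# vectors; the st- words point in unrelated directions.
data = {
    "glow": (1.0, 0.1), "gleam": (0.9, -0.1), "glint": (1.0, 0.0), "gloss": (0.9, 0.0),
    "snout": (0.0, 1.0), "sniff": (0.1, 0.9), "snore": (-0.1, 1.0), "snort": (0.0, 0.9),
    "stone": (1.0, 0.0), "stamp": (-1.0, 0.0), "story": (0.0, 1.0), "steam": (0.0, -1.0),
}

def initial_bigram_features(words):
    """Binary indicators for each word-initial two-letter cluster (an assumption)."""
    vocab = sorted({w[:2] for w in words})
    index = {cluster: j for j, cluster in enumerate(vocab)}
    X = [[0.0] * len(vocab) for _ in words]
    for i, w in enumerate(words):
        X[i][index[w[:2]]] = 1.0
    return X, vocab

def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def lasso_ista(X, Y, lam, steps=500):
    """Minimize (1/2n) * ||XW - Y||^2 + lam * sum|W| by proximal gradient (ISTA)."""
    n, p, d = len(X), len(X[0]), len(Y[0])
    # n / ||X||_F^2 is a safe step size: it never exceeds 1 / Lipschitz constant.
    step = n / sum(x * x for row in X for x in row)
    W = [[0.0] * d for _ in range(p)]
    for _ in range(steps):
        # Residuals are computed once per iteration, before any weight changes.
        R = [[sum(X[i][j] * W[j][k] for j in range(p)) - Y[i][k] for k in range(d)]
             for i in range(n)]
        for j in range(p):
            for k in range(d):
                grad = sum(X[i][j] * R[i][k] for i in range(n)) / n
                W[j][k] = soft_threshold(W[j][k] - step * grad, step * lam)
    return W

words = list(data)
X, vocab = initial_bigram_features(words)
Y = [list(data[w]) for w in words]
W = lasso_ista(X, Y, lam=0.1)

# Clusters whose weight rows survive the L1 penalty are the candidates.
selected = [c for c, row in zip(vocab, W) if math.hypot(*row) > 1e-9]
print(selected)
```

In this toy setup the penalty zeroes out the weights of a cluster whose words have no consistent direction in vector space, while clusters whose words share a direction keep nonzero weights. A real experiment would use pretrained word vectors and a much larger feature inventory, and the regularization strength `lam` would need tuning.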