Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones
Zhenisbek Assylbekov, Rustem Takhanov, Bagdat Myrzakhmetov, Jonathan N. Washington
Abstract
Syllabification does not seem to improve word-level RNN language modeling quality when compared to character-based segmentation. However, our best syllable-aware language model, achieving performance comparable to the competitive character-aware model, has 18%-33% fewer parameters and is trained 1.2-2.2 times faster.
- Anthology ID:
- D17-1199
- Volume:
- Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
- Month:
- September
- Year:
- 2017
- Address:
- Copenhagen, Denmark
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1866–1872
- URL:
- https://aclanthology.org/D17-1199
- DOI:
- 10.18653/v1/D17-1199
- Cite (ACL):
- Zhenisbek Assylbekov, Rustem Takhanov, Bagdat Myrzakhmetov, and Jonathan N. Washington. 2017. Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1866–1872, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal):
- Syllable-aware Neural Language Models: A Failure to Beat Character-aware Ones (Assylbekov et al., EMNLP 2017)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/D17-1199.pdf
- Code:
- zh3nis/lstm-syl
- Data:
- Penn Treebank