Abstract
The research question we explore in this study is how to obtain syntactically plausible word representations without using human annotations. Our underlying hypothesis is that word ordering tests, or linearizations, are suitable for learning syntactic knowledge about words. To verify this hypothesis, we develop a differentiable model called the Word Ordering Network (WON) that explicitly learns to recover the correct word order while implicitly acquiring word embeddings that represent syntactic knowledge. We evaluate the word embeddings produced by the proposed method on downstream syntax-related tasks such as part-of-speech tagging and dependency parsing. The experimental results demonstrate that the WON consistently outperforms both order-insensitive and order-sensitive baselines on these tasks.

- Anthology ID:
- I17-1008
- Volume:
- Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month:
- November
- Year:
- 2017
- Address:
- Taipei, Taiwan
- Venue:
- IJCNLP
- Publisher:
- Asian Federation of Natural Language Processing
- Pages:
- 70–79
- URL:
- https://aclanthology.org/I17-1008
- Cite (ACL):
- Noriki Nishida and Hideki Nakayama. 2017. Word Ordering as Unsupervised Learning Towards Syntactically Plausible Word Representations. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 70–79, Taipei, Taiwan. Asian Federation of Natural Language Processing.
- Cite (Informal):
- Word Ordering as Unsupervised Learning Towards Syntactically Plausible Word Representations (Nishida & Nakayama, IJCNLP 2017)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/I17-1008.pdf
- Code:
- norikinishida/won
- Data:
- BookCorpus, Penn Treebank
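
The word ordering test described in the abstract needs no human annotation, because any plain sentence supplies its own supervision: shuffle the words and ask a model to recover the original order. The sketch below illustrates only how such a training pair might be constructed. It is not the WON architecture, and the function names `make_ordering_example` and `restore` are hypothetical, introduced here for illustration.

```python
import random


def make_ordering_example(tokens, rng):
    """Build one self-supervised word-ordering pair (hypothetical helper).

    Returns (shuffled, target), where target[i] is the index in `shuffled`
    of the word that belongs at position i of the original sentence.
    """
    perm = list(range(len(tokens)))
    rng.shuffle(perm)
    # shuffled[k] holds the word that was originally at position perm[k].
    shuffled = [tokens[j] for j in perm]
    # Invert the permutation so that target[perm[k]] == k.
    target = [0] * len(tokens)
    for k, j in enumerate(perm):
        target[j] = k
    return shuffled, target


def restore(shuffled, target):
    """Reorder the shuffled words according to the target indices."""
    return [shuffled[k] for k in target]


rng = random.Random(0)  # fixed seed so the example is reproducible
sentence = "the cat sat on the mat".split()
shuffled, target = make_ordering_example(sentence, rng)

# Following the target indices always recovers the original sentence.
assert restore(shuffled, target) == sentence
```

In this framing, a model would receive `shuffled` as input and be trained to predict `target`. How the WON actually scores and selects orderings is specified in the paper, not in this sketch.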