Learning Generic Sentence Representations Using Convolutional Neural Networks
Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, Lawrence Carin
Abstract
We propose a new encoder-decoder approach to learn distributed sentence representations that are applicable to multiple purposes. The model is learned by using a convolutional neural network as an encoder to map an input sentence into a continuous vector, and using a long short-term memory recurrent neural network as a decoder. Several tasks are considered, including sentence reconstruction and future sentence prediction. Further, a hierarchical encoder-decoder model is proposed to encode a sentence to predict multiple future sentences. By training our models on a large collection of novels, we obtain a highly generic convolutional sentence encoder that performs well in practice. Experimental results on several benchmark datasets, and across a broad range of applications, demonstrate the superiority of the proposed model over competing methods.
- Anthology ID:
- D17-1254
- Volume:
- Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
- Month:
- September
- Year:
- 2017
- Address:
- Copenhagen, Denmark
- Editors:
- Martha Palmer, Rebecca Hwa, Sebastian Riedel
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2390–2400
- URL:
- https://aclanthology.org/D17-1254
- DOI:
- 10.18653/v1/D17-1254
- Cite (ACL):
- Zhe Gan, Yunchen Pu, Ricardo Henao, Chunyuan Li, Xiaodong He, and Lawrence Carin. 2017. Learning Generic Sentence Representations Using Convolutional Neural Networks. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2390–2400, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal):
- Learning Generic Sentence Representations Using Convolutional Neural Networks (Gan et al., EMNLP 2017)
- PDF:
- https://preview.aclanthology.org/teach-a-man-to-fish/D17-1254.pdf
- Data:
- BookCorpus, MPQA Opinion Corpus, MS COCO
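
The abstract describes a convolutional encoder that maps an input sentence of any length to a single continuous vector. The sketch below illustrates that idea in plain Python. It is not the authors' implementation: the abstract does not state the nonlinearity, the pooling scheme, or any sizes, so the tanh activation, the max-over-time pooling, the dimensions, and the names `conv_encode` and `random_sentence` are all assumptions made for illustration. The LSTM decoder is omitted.

```python
import math
import random


def conv_encode(embeddings, filters, window):
    """Encode a sentence (a list of word vectors) as a fixed-length vector.

    Assumed design, not taken from the abstract: each filter is applied to
    every run of `window` consecutive word vectors, passed through tanh,
    and reduced with max-over-time pooling.
    """
    n = len(embeddings)
    if n < window:
        raise ValueError("sentence is shorter than the filter window")
    features = []
    for filt in filters:  # filt is a flat weight list of length window * dim
        responses = []
        for start in range(n - window + 1):
            # Flatten the word vectors inside this window into one vector.
            patch = [x for vec in embeddings[start:start + window] for x in vec]
            responses.append(math.tanh(sum(w * x for w, x in zip(filt, patch))))
        features.append(max(responses))  # max-over-time pooling
    return features


# Hypothetical toy sizes; random values stand in for learned parameters.
rng = random.Random(0)
dim, window, num_filters = 4, 3, 3
filters = [[rng.uniform(-1, 1) for _ in range(window * dim)]
           for _ in range(num_filters)]


def random_sentence(length):
    """Return `length` random word vectors standing in for real embeddings."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(length)]


short_vec = conv_encode(random_sentence(5), filters, window)
long_vec = conv_encode(random_sentence(9), filters, window)
print(len(short_vec), len(long_vec))  # prints "3 3"
```

Under these assumptions, sentences of 5 and 9 words both yield a vector with one entry per filter, which is what allows a single fixed-size representation to be passed to a decoder regardless of sentence length.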