pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Mandar Joshi, Eunsol Choi, Omer Levy, Daniel Weld, Luke Zettlemoyer
Abstract
Reasoning about implied relationships (e.g. paraphrastic, common sense, encyclopedic) between pairs of words is crucial for many cross-sentence inference problems. This paper proposes new methods for learning and using embeddings of word pairs that implicitly represent background knowledge about such relationships. Our pairwise embeddings are computed as a compositional function of each word’s representation, which is learned by maximizing the pointwise mutual information (PMI) with the contexts in which the two words co-occur. We add these representations to the cross-sentence attention layer of existing inference models (e.g. BiDAF for QA, ESIM for NLI), instead of extending or replacing existing word embeddings. Experiments show a gain of 2.7% on the recently released SQuAD 2.0 and 1.3% on MultiNLI. Our representations also aid in better generalization with gains of around 6-7% on adversarial SQuAD datasets, and 8.8% on the adversarial entailment test set by Glockner et al. (2018).
- Anthology ID:
- N19-1362
- Volume:
- Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Editors:
- Jill Burstein, Christy Doran, Thamar Solorio
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3597–3608
- URL:
- https://aclanthology.org/N19-1362
- DOI:
- 10.18653/v1/N19-1362
- Cite (ACL):
- Mandar Joshi, Eunsol Choi, Omer Levy, Daniel Weld, and Luke Zettlemoyer. 2019. pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3597–3608, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference (Joshi et al., NAACL 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/N19-1362.pdf
- Code:
- mandarjoshi90/pair2vec
- Data:
- MultiNLI, SNLI, SQuAD
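
The abstract states that the pair representations are learned by maximizing the pointwise mutual information (PMI) between a word pair and the contexts in which the two words co-occur. As a rough illustration of that quantity only, the sketch below estimates PMI((x, y), c) = log [ p((x, y), c) / (p((x, y)) p(c)) ] from raw counts. It is not the authors' training procedure or their compositional model; the toy triples, the context templates, and the function name `pmi_table` are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data: (word_x, word_y, context) triples.
# Each context is a template in which the two words co-occurred.
triples = [
    ("hot", "cold", "X is the opposite of Y"),
    ("hot", "cold", "X is the opposite of Y"),
    ("tall", "short", "X is the opposite of Y"),
    ("paris", "france", "X is the capital of Y"),
    ("paris", "france", "X is the capital of Y"),
    ("tokyo", "japan", "X is the capital of Y"),
]


def pmi_table(triples):
    """Estimate PMI((x, y), c) from counts for every observed (pair, context)."""
    total = len(triples)
    joint = Counter(((x, y), c) for x, y, c in triples)
    pair_counts = Counter((x, y) for x, y, _ in triples)
    context_counts = Counter(c for _, _, c in triples)

    table = {}
    for (pair, context), n in joint.items():
        p_joint = n / total
        p_pair = pair_counts[pair] / total
        p_context = context_counts[context] / total
        # PMI is the log ratio of the joint probability to the product
        # of the marginals; positive values mean the pair and the context
        # co-occur more often than independence would predict.
        table[(pair, context)] = math.log(p_joint / (p_pair * p_context))
    return table


pmi = pmi_table(triples)
for (pair, context), score in sorted(pmi.items()):
    print(pair, "|", context, "|", round(score, 3))
```

In this toy corpus every pair appears with only one context, so all observed scores are positive. Count-based PMI of this kind is undefined for unseen (pair, context) combinations, which is one motivation for learning dense representations instead of storing a table.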