Encoding Prior Knowledge with Eigenword Embeddings

Dominique Osborne; Shashi Narayan; Shay B. Cohen

doi:10.1162/tacl_a_00108

Abstract

Canonical correlation analysis (CCA) is a method for reducing the dimension of data represented using two views. It has been previously used to derive word embeddings, where one view indicates a word, and the other view indicates its context. We describe a way to incorporate prior knowledge into CCA, give a theoretical justification for it, and test it by deriving word embeddings and evaluating them on a myriad of datasets.
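As a rough illustration of the two-view setup described in the abstract, the sketch below derives CCA-style word embeddings from a word-by-context co-occurrence matrix. It assumes the common count-based formulation: rescale the co-occurrence matrix by the inverse square roots of the word and context counts, then take a truncated SVD. The function name `cca_word_embeddings`, the toy matrix, and the `1e-12` guard are illustrative assumptions, not taken from the paper, and the sketch does not include the paper's prior-knowledge extension.

```python
import numpy as np


def cca_word_embeddings(cooc, k):
    """Return (embeddings, singular_values) for a word-by-context count matrix.

    cooc: array of shape (n_words, n_contexts) with co-occurrence counts.
    k:    number of embedding dimensions to keep.
    """
    cooc = np.asarray(cooc, dtype=float)

    # Diagonal "covariance" approximations: how often each word / context occurs.
    word_counts = cooc.sum(axis=1)
    ctx_counts = cooc.sum(axis=0)

    # Inverse square roots, guarded against empty rows or columns (assumption).
    w_scale = 1.0 / np.sqrt(np.maximum(word_counts, 1e-12))
    c_scale = 1.0 / np.sqrt(np.maximum(ctx_counts, 1e-12))

    # Whitened cross-covariance between the word view and the context view.
    scaled = w_scale[:, None] * cooc * c_scale[None, :]

    # Left singular vectors give one k-dimensional vector per word.
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    return u[:, :k], s[:k]


# Toy data: 4 words, 3 contexts (hypothetical counts).
toy = np.array([[4, 1, 0],
                [3, 2, 0],
                [0, 1, 5],
                [0, 0, 6]])
emb, sv = cca_word_embeddings(toy, k=2)
```

Each row of `emb` is the embedding of one word. In this toy matrix, words 0 and 1 share contexts, as do words 2 and 3, so the two pairs should fall in different regions of the embedding space.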

Anthology ID: Q16-1030
Volume: Transactions of the Association for Computational Linguistics, Volume 4
Year: 2016
Address: Cambridge, MA
Venue: TACL
Publisher: MIT Press
Pages: 417–430
URL: https://aclanthology.org/Q16-1030
DOI: 10.1162/tacl_a_00108
Cite (ACL): Dominique Osborne, Shashi Narayan, and Shay B. Cohen. 2016. Encoding Prior Knowledge with Eigenword Embeddings. Transactions of the Association for Computational Linguistics, 4:417–430.
Cite (Informal): Encoding Prior Knowledge with Eigenword Embeddings (Osborne et al., TACL 2016)
PDF: https://preview.aclanthology.org/remove-xml-comments/Q16-1030.pdf
Data: FrameNet