Abstract
Word embeddings have been shown to benefit NLP tasks substantially. Yet there is room for improvement in the vector representations, because current word embeddings typically contain unnecessary information, i.e., noise. We propose two novel models that improve word embeddings via unsupervised learning, yielding word denoising embeddings. These denoising embeddings are obtained by strengthening salient information and weakening noise in the original word embeddings, using a deep feed-forward neural network filter. Results on benchmark tasks show that the filtered word denoising embeddings outperform the original word embeddings.
- Anthology ID:
- C16-1254
- Volume:
- Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Yuji Matsumoto, Rashmi Prasad
- Venue:
- COLING
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 2699–2707
- URL:
- https://aclanthology.org/C16-1254
- Cite (ACL):
- Kim Anh Nguyen, Sabine Schulte im Walde, and Ngoc Thang Vu. 2016. Neural-based Noise Filtering from Word Embeddings. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 2699–2707, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Neural-based Noise Filtering from Word Embeddings (Nguyen et al., COLING 2016)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/C16-1254.pdf
- Code:
- nguyenkh/NeuralDenoising
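The abstract's core idea of a feed-forward filter that strengthens salient dimensions and weakens noisy ones can be illustrated with a minimal sketch. This is not the authors' implementation (which lives in the linked `nguyenkh/NeuralDenoising` repository); it is a hypothetical NumPy toy in which a one-hidden-layer network produces a sigmoid gate per embedding dimension, and the gate multiplies the original vector. All weights here are random and untrained, so it only shows the shape of the computation, not learned denoising.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_filter(dim, hidden, rng):
    """Initialize a hypothetical one-hidden-layer feed-forward filter."""
    return {
        "W1": rng.normal(0, 0.1, (hidden, dim)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0, 0.1, (dim, hidden)),
        "b2": np.zeros(dim),
    }

def denoise(v, params):
    """Gate each embedding dimension: values near 1 keep it (salient),
    values near 0 suppress it (noise)."""
    h = np.tanh(params["W1"] @ v + params["b1"])
    gate = 1.0 / (1.0 + np.exp(-(params["W2"] @ h + params["b2"])))  # sigmoid in (0, 1)
    return v * gate  # element-wise filtering of the original embedding

dim = 50
params = init_filter(dim, hidden=100, rng=rng)
v = rng.normal(size=dim)            # stand-in for a pre-trained word embedding
v_denoised = denoise(v, params)
print(v_denoised.shape)             # (50,)
```

Because the gate lies strictly in (0, 1), every output coordinate has magnitude no larger than the corresponding input coordinate; training (which this sketch omits) would determine which dimensions the gate keeps close to their original values.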