Abstract
We assess the reliability and accuracy of (neural) word embeddings for both modern and historical English and German. Our research provides deeper insights into the empirically justified choice of optimal training methods and parameters. The overall low reliability we observe, nevertheless, casts doubt on the suitability of word neighborhoods in embedding spaces as a basis for qualitative conclusions on synchronic and diachronic lexico-semantic matters, an issue currently high on the agenda of Digital Humanities.
- Anthology ID:
- C16-1262
- Volume:
- Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Venue:
- COLING
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 2785–2796
- URL:
- https://aclanthology.org/C16-1262
- Cite (ACL):
- Johannes Hellrich and Udo Hahn. 2016. Bad Company—Neighborhoods in Neural Embedding Spaces Considered Harmful. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 2785–2796, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Bad Company—Neighborhoods in Neural Embedding Spaces Considered Harmful (Hellrich & Hahn, COLING 2016)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/C16-1262.pdf
- Code:
- hellrich/coling2016