Abstract
We present an approach to detecting differences in lexical semantics across English language registers, using word embedding models from the distributional semantics paradigm. Models trained on register-specific subcorpora of the British National Corpus (BNC) are used to compare lists of nearest associates for particular words and to draw conclusions about how their meaning shifts depending on the register in which they are used. The models are evaluated on the task of register classification using the deep inverse regression approach. Additionally, we present a demo web service that features most of the described models and allows users to explore word meanings in different English registers and to detect the register affiliation of arbitrary texts. The code for the service can be easily adapted to any set of underlying models.
- Anthology ID:
- W16-4005
- Volume:
- Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Erhard Hinrichs, Marie Hinrichs, Thorsten Trippel
- Venue:
- LT4DH
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 26–34
- URL:
- https://aclanthology.org/W16-4005
- Cite (ACL):
- Andrey Kutuzov, Elizaveta Kuzmenko, and Anna Marakasova. 2016. Exploration of register-dependent lexical semantics using word embeddings. In Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH), pages 26–34, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Exploration of register-dependent lexical semantics using word embeddings (Kutuzov et al., LT4DH 2016)
- PDF:
- https://preview.aclanthology.org/ingest-bitext-workshop/W16-4005.pdf
- Code:
- ElizavetaKuzmenko/dsm_genres
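
The comparison of nearest associates mentioned in the abstract might be sketched as follows. This is only an illustration, not the authors' implementation: the vectors are made-up toy values, the register names and the helper functions `cosine`, `nearest_associates`, and `jaccard` are hypothetical, and using Jaccard overlap as the measure of shift is an assumption, since the abstract does not say how the lists are compared.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_associates(model, word, k=2):
    """Return the k words whose vectors are closest to `word` in `model`."""
    target = model[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in model.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


def jaccard(a, b):
    """Share of items the two lists have in common (0.0 = disjoint)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Toy, made-up embeddings standing in for two register-specific models.
academic = {
    "cell": [0.9, 0.1, 0.0],
    "tissue": [0.8, 0.2, 0.1],
    "membrane": [0.85, 0.15, 0.0],
    "prison": [0.0, 0.9, 0.2],
    "phone": [0.1, 0.1, 0.9],
}
fiction = {
    "cell": [0.1, 0.9, 0.1],
    "tissue": [0.9, 0.1, 0.1],
    "membrane": [0.8, 0.0, 0.2],
    "prison": [0.0, 0.95, 0.1],
    "phone": [0.2, 0.7, 0.6],
}

academic_neighbours = nearest_associates(academic, "cell")
fiction_neighbours = nearest_associates(fiction, "cell")
overlap = jaccard(academic_neighbours, fiction_neighbours)

print(academic_neighbours, fiction_neighbours, overlap)
```

In this toy setup a low overlap between the two neighbour lists would suggest that the word is used differently in the two registers; with real models the lists would come from embeddings trained on the respective subcorpora.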