Distinguishing Japanese Non-standard Usages from Standard Ones
Tatsuya Aoki, Ryohei Sasano, Hiroya Takamura, Manabu Okumura
Abstract
We focus on non-standard usages of common words on social media. In the context of social media, words sometimes have other usages that are totally different from their original. In this study, we attempt to distinguish non-standard usages on social media from standard ones in an unsupervised manner. Our basic idea is that non-standardness can be measured by the inconsistency between the expected meaning of the target word and the given context. For this purpose, we use context embeddings derived from word embeddings. Our experimental results show that the model leveraging the context embedding outperforms other methods and provide us with findings, for example, on how to construct context embeddings and which corpus to use.
- Anthology ID:
- D17-1246
- Volume:
- Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
- Month:
- September
- Year:
- 2017
- Address:
- Copenhagen, Denmark
- Editors:
- Martha Palmer, Rebecca Hwa, Sebastian Riedel
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2323–2328
- URL:
- https://aclanthology.org/D17-1246
- DOI:
- 10.18653/v1/D17-1246
- Cite (ACL):
- Tatsuya Aoki, Ryohei Sasano, Hiroya Takamura, and Manabu Okumura. 2017. Distinguishing Japanese Non-standard Usages from Standard Ones. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2323–2328, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal):
- Distinguishing Japanese Non-standard Usages from Standard Ones (Aoki et al., EMNLP 2017)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/D17-1246.pdf
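The basic idea stated in the abstract, that non-standardness can be measured by the inconsistency between the expected meaning of the target word and its context, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the context embedding is built, so the sketch assumes a plain average of the surrounding word vectors and uses cosine similarity as the consistency measure. The function names, the window size, and the two-dimensional toy vectors are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def context_embedding(tokens, index, embeddings, window=2):
    """Average the vectors of the words around tokens[index].

    Assumption: simple averaging over a symmetric window; the paper may
    construct context embeddings differently.
    """
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    vectors = [
        embeddings[tokens[i]]
        for i in range(lo, hi)
        if i != index and tokens[i] in embeddings
    ]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def consistency_score(tokens, index, embeddings, window=2):
    """Similarity between the target word vector and its context embedding.

    A low score suggests the context is inconsistent with the word's
    expected meaning, i.e. a candidate non-standard usage.
    """
    ctx = context_embedding(tokens, index, embeddings, window)
    if ctx is None:
        return None
    return cosine(embeddings[tokens[index]], ctx)


# Invented toy vectors, not taken from any trained model or from the paper.
TOY_EMBEDDINGS = {
    "potato": [1.0, 0.0],
    "fried": [0.9, 0.1],
    "eat": [0.8, 0.2],
    "server": [0.0, 1.0],
    "slow": [0.1, 0.9],
}

# Target word "potato" (index 2) in a food context and in an unrelated context.
standard = consistency_score(["eat", "fried", "potato"], 2, TOY_EMBEDDINGS)
non_standard = consistency_score(["slow", "server", "potato"], 2, TOY_EMBEDDINGS)
print(standard > non_standard)
```

In this toy setting the target word scores high in the context that matches its expected meaning and low in the unrelated one. Because the approach is unsupervised, a real system would rank or threshold such scores rather than train on labeled usages.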