Abstract
The prevalence of informal language such as slang presents challenges for natural language systems, particularly in the automatic discovery of flexible word usages. Previous work has explored slang in terms of dictionary construction, sentiment analysis, word formation, and interpretation, but scarce research has attempted the basic problem of slang detection and identification. We examine the extent to which deep learning methods support automatic detection and identification of slang from natural sentences using a combination of bidirectional recurrent neural networks, conditional random field, and multilayer perceptron. We test these models based on a comprehensive set of linguistic features in sentence-level detection and token-level identification of slang. We found that a prominent feature of slang is the surprising use of words across syntactic categories or syntactic shift (e.g., verb-noun). Our best models detect the presence of slang at the sentence level with an F1-score of 0.80 and identify its exact position at the token level with an F1-score of 0.50.
- Anthology ID: K19-1082
- Volume: Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
- Month: November
- Year: 2019
- Address: Hong Kong, China
- Venue: CoNLL
- SIG: SIGNLL
- Publisher: Association for Computational Linguistics
- Pages: 881–889
- URL: https://aclanthology.org/K19-1082
- DOI: 10.18653/v1/K19-1082
- Cite (ACL): Zhengqi Pei, Zhewei Sun, and Yang Xu. 2019. Slang Detection and Identification. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), pages 881–889, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal): Slang Detection and Identification (Pei et al., CoNLL 2019)
- PDF: https://preview.aclanthology.org/ingestion-script-update/K19-1082.pdf