Abstract
Language is an important marker of a cultural group, large or small. One aspect of language variation between communities is the employment of highly specialized terms with unique significance to the group. We study these high affinity terms across a wide variety of communities by leveraging the rich diversity of Reddit.com. We provide a systematic exploration of high affinity terms, the often rapid semantic shifts they undergo, and their relationship to subreddit characteristics across 2600 diverse subreddits. Our results show that high affinity terms are effective signals of loyal communities, that they undergo more semantic shift than low affinity terms, and that they are a partial barrier to entry for new users. We conclude that Reddit is a robust and valuable data source for testing further theories about high affinity terms across communities.
- Anthology ID:
- D19-5508
- Volume:
- Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Venue:
- WNUT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 57–67
- URL:
- https://aclanthology.org/D19-5508
- DOI:
- 10.18653/v1/D19-5508
- Cite (ACL):
- Abhinav Bhandari and Caitrin Armstrong. 2019. Tkol, Httt, and r/radiohead: High Affinity Terms in Reddit Communities. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 57–67, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Tkol, Httt, and r/radiohead: High Affinity Terms in Reddit Communities (Bhandari & Armstrong, WNUT 2019)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/D19-5508.pdf