Diachronic degradation of language models: Insights from social media

Kokil Jaidka, Niyati Chhaya, Lyle Ungar


Abstract
Natural languages change over time because they evolve to meet the needs of their users and the socio-technological environment. This study investigates the diachronic accuracy of pre-trained language models for downstream tasks in machine learning and user profiling. It asks: given that the social media platform and its users remain the same, how is language changing over time, and how can these differences be used to track changes in the affect around a particular topic? To our knowledge, this is the first study to show that it is possible to measure diachronic semantic drift within social media and within the span of a few years.
Anthology ID:
P18-2032
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
195–200
URL:
https://aclanthology.org/P18-2032
DOI:
10.18653/v1/P18-2032
Cite (ACL):
Kokil Jaidka, Niyati Chhaya, and Lyle Ungar. 2018. Diachronic degradation of language models: Insights from social media. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 195–200, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Diachronic degradation of language models: Insights from social media (Jaidka et al., ACL 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/P18-2032.pdf
Poster:
 P18-2032.Poster.pdf