TækTåk: Syntactic Analysis of Language Use on Danish TikTok

Thea Kristensen, Rob van der Goot


Abstract
Language use differs across language communities. Social media provides a rich source for studying this variation, as it contains large amounts of data from a wide variety of sub-communities. In this paper, we study language use on Danish TikTok. TikTok is a video-based platform, but most users are mainly active in the text-based comment sections. With the goal of analyzing this language variety, we contribute: 1) the first Danish social media treebank annotated for Universal Dependencies; 2) an evaluation of a variety of parsers using the new treebank, showing that cross-lingual in-domain data provides a valuable signal; and 3) a comparison of syntactic trends in standard Danish and TikTok language.
Anthology ID:
2026.lrec-main.902
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resources Association
Pages:
11524–11534
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.902/
Cite (ACL):
Thea Kristensen and Rob van der Goot. 2026. TækTåk: Syntactic Analysis of Language Use on Danish TikTok. International Conference on Language Resources and Evaluation, main:11524–11534.
Cite (Informal):
TækTåk: Syntactic Analysis of Language Use on Danish TikTok (Kristensen & van der Goot, LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.902.pdf