Zeerak Talat

2026

IYKYK: Using language models to decode extremist cryptolects
Christine de Kock | Arij Riabi | Zeerak Talat | Michael Sejr Schlichtkrull | Pranava Madhyastha | Eduard Hovy
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

Extremist groups develop complex in-group language, also referred to as cryptolects, to exclude or mislead outsiders. We investigate the ability of current language technologies to detect and interpret the cryptolects of two online extremist platforms. Evaluating eight models across six tasks, our results indicate that general purpose LLMs cannot consistently detect or decode extremist language. However, performance can be significantly improved by domain adaptation and specialised prompting techniques. These results provide important insights to inform the development and deployment of automated moderation technologies. We further develop and release novel labelled and unlabelled datasets, including 19.4M posts from extremist platforms and lexicons validated by human experts.

FedMental: Evaluating Federated Learning for Mental Health Detection from Social Media Data
Nuredin Ali Abdelkadir | Anjali Ratnam | Zeerak Talat | Stevie Chancellor
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Social media text data are often used to train Machine Learning (ML) models to identify users exhibiting high-risk mental health behaviors. However, sharing this sensitive data poses privacy risks and limits the growth of benchmark datasets. We comprehensively evaluate whether privacy-preserving ML techniques can enable safer data sharing while preserving performance. Specifically, we apply federated learning (FL) and Differentially Private FL to two widely studied mental health prediction tasks: depression detection on X (Twitter) and suicide crisis detection on Reddit. We simulate realistic data-sharing scenarios by treating each user as a client in a non-IID setting, evaluating across different client fractions, aggregation strategies, and privacy budgets. While FL achieves comparable performance to centralized training (centralized F1 = 85.63; best FL model F1 = 83.16) on depression identification, we find that Differentially Private FL has a large performance-privacy trade-off (a drop of up to 27.01 F1 points) even with low levels of noise (ε = 50). This is due to the distortion of highly informative yet sparse linguistic markers related to mental health, such as health topics and emotion words. This research empirically demonstrates the potential and limitations of current privacy preservation techniques for mental health inference tasks.

Co-authors

Arij Riabi 1

Michael Schlichtkrull 1

Christine de Kock 1

Venues

ACL 1
EACL 1
