SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset
Niyazi Ahmet Metin, Sevde Yılmaz, Osman Enes Erdoğdu, Elif Sude Meydan, Oğul Sümer, Dilara Keküllüoğlu
Abstract
Sarcasm is a colloquial form of language used to convey messages non-literally, which affects the performance of many NLP tasks. Sarcasm detection is not trivial, and existing work focuses mainly on English. We present SarcasTürk, a context-aware Turkish sarcasm detection dataset built from entries on Ekşi Sözlük, a large-scale Turkish online discussion platform where people frequently use sarcasm. SarcasTürk contains 1,515 entries from 98 titles with binary sarcasm labels and a title-level context field created to support comparisons between entry-only and context-aware models. We generate these contexts by selecting representative sentences from all entries under a title using summarization techniques. We report baseline results for a fine-tuned BERTurk classifier and zero-shot LLMs under both no-context and context-aware conditions. We find that the BERTurk model with title-level context performs best, with 0.76 accuracy and balanced class-wise F1 scores (0.77 for sarcasm, 0.75 for no sarcasm). Because the dataset contains potentially sensitive and offensive language, SarcasTürk can be shared upon request to the authors.
- Anthology ID:
- 2026.sigturk-1.6
- Volume:
- Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026)
- Month:
- March
- Year:
- 2026
- Address:
- Rabat, Morocco
- Editors:
- Kemal Oflazer, Abdullatif Köksal, Onur Varol
- Venues:
- SIGTURK | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 61–71
- URL:
- https://preview.aclanthology.org/manual-author-scripts/2026.sigturk-1.6/
- Cite (ACL):
- Niyazi Ahmet Metin, Sevde Yılmaz, Osman Enes Erdoğdu, Elif Sude Meydan, Oğul Sümer, and Dilara Keküllüoğlu. 2026. SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset. In Proceedings of the Second Workshop Natural Language Processing for Turkic Languages (SIGTURK 2026), pages 61–71, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal):
- SarcasTürk: Turkish Context-Aware Sarcasm Detection Dataset (Metin et al., SIGTURK 2026)
- PDF:
- https://preview.aclanthology.org/manual-author-scripts/2026.sigturk-1.6.pdf