Muhammad Quddussi Kashaf


2026

The spread of conspiracy theory content in modern media makes it difficult to verify the credibility of information and public discourse. This paper presents our methodology for "SemEval-2026 Task 10: Psycholinguistic Conspiracy Marker Extraction and Detection". The task consists of two subtasks: extracting psycholinguistic markers from text using Named Entity Recognition (NER) techniques, and classifying Reddit comments as conspiratorial or non-conspiratorial. Our approach involved (1) diverse extraction methodologies, including traditional BIO tagging schemes, the GlobalPointer framework, and the GLiNER2 architecture; (2) data augmentation and synthetic data generation via Large Language Models (LLMs); and (3) evaluation of various transformer-based models, such as DistilBERT and COVID-Twitter-BERT. Our final system achieves a macro F1 score of 0.26 on Subtask 1 and 0.76 on Subtask 2.
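
To make the traditional BIO tagging scheme mentioned above concrete, the following minimal sketch converts token-level marker spans into BIO tags. It is an illustration under stated assumptions, not our actual pipeline: the label names (`ACTOR`, `ACTION`), the example sentence, the whitespace tokenization, and the helper `spans_to_bio` are all hypothetical.

```python
from typing import List, Tuple

# (start_token, end_token_exclusive, label); labels here are hypothetical
Span = Tuple[int, int, str]


def spans_to_bio(tokens: List[str], spans: List[Span]) -> List[str]:
    """Convert non-overlapping token spans into a BIO tag sequence."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span ({start}, {end}) is out of range")
        # First token of a span gets B-, the remaining tokens get I-
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Illustrative sentence and spans (assumed, not taken from the task data)
tokens = "They are hiding the truth from us".split()
spans = [(0, 1, "ACTOR"), (2, 5, "ACTION")]
tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A flat BIO sequence assigns exactly one tag per token, so it cannot encode overlapping or nested spans. Span-based formulations such as GlobalPointer instead score candidate (start, end) pairs per label, which avoids this restriction.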