Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data

Navyansh Singh

Abstract

Logical fallacy detection models frequently over-flag valid reasoning due to reliance on surface-level spurious correlations. We introduce 703 LLM-generated Counterfactually Augmented Data (CAD) pairs (minimally differentiated valid and fallacious arguments) to debias models through targeted augmentation. Fine-tuning DeBERTa-v3-large on CoCoLoFa augmented with these pairs yields marginal in-distribution improvement (+0.4% F1) but substantial out-of-distribution robustness: 58% relative reduction in false positive rate (64% → 26.7%) on a 300-sample Reddit-sourced evaluation set. While recent LLMs (Llama-3.1-8B, Llama-3.3-70B) achieve high performance under optimized prompts (F1 90–94%), they degrade severely under simple human-like prompts (F1 63–72%, FPR 54–74%). Our lightweight, prompt-invariant approach achieves competitive robustness (F1 85.9%, FPR 26.7%) across all prompting regimes without prompt engineering, making it stable for production deployment with unpredictable user input. The dataset and model are publicly released.
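The relative reduction quoted in the abstract can be checked from the two false positive rates it reports. The snippet below is a minimal sketch of that arithmetic only; the function name `relative_reduction` is a hypothetical helper introduced for illustration and is not taken from the paper or its released code.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent.

    Hypothetical helper for illustration; both arguments are rates in percent.
    """
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before * 100.0


# False positive rates reported in the abstract (percent).
fpr_baseline = 64.0
fpr_augmented = 26.7

# Absolute drop in percentage points, and the drop relative to the baseline.
absolute_drop = fpr_baseline - fpr_augmented
relative_drop = relative_reduction(fpr_baseline, fpr_augmented)

print(f"absolute drop: {absolute_drop:.1f} points")
print(f"relative reduction: {relative_drop:.1f}%")
```

The absolute drop is 37.3 percentage points, which is about 58% of the 64% baseline, matching the figure stated in the abstract.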

Anthology ID: 2026.acl-srw.30
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 363–374
URL: https://preview.aclanthology.org/ingestion-form-platform/2026.acl-srw.30/
Cite (ACL): Navyansh Singh. 2026. Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 363–374, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data (Singh, ACL 2026)
PDF: https://preview.aclanthology.org/ingestion-form-platform/2026.acl-srw.30.pdf
