Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data

Navyansh Singh


Abstract
Logical fallacy detection models frequently over-flag valid reasoning due to reliance on surface-level spurious correlations. We introduce 703 LLM-generated Counterfactually Augmented Data (CAD) pairs (minimally differentiated valid and fallacious arguments) to debias models through targeted augmentation. Fine-tuning DeBERTa-v3-large on CoCoLoFa augmented with these pairs yields marginal in-distribution improvement (+0.4% F1) but substantial out-of-distribution robustness: 58% relative reduction in false positive rate (64% → 26.7%) on a 300-sample Reddit-sourced evaluation set. While recent LLMs (Llama-3.1-8B, Llama-3.3-70B) achieve high performance under optimized prompts (F1 90–94%), they degrade severely under simple human-like prompts (F1 63–72%, FPR 54–74%). Our lightweight, prompt-invariant approach achieves competitive robustness (F1 85.9%, FPR 26.7%) across all prompting regimes without prompt engineering, making it stable for production deployment with unpredictable user input. The dataset and model are publicly released.
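The 58% figure in the abstract is a relative reduction, not a drop in percentage points. A minimal sketch of that arithmetic follows, assuming the standard definition of false positive rate (valid arguments wrongly flagged as fallacious, divided by all valid arguments). The function names are illustrative and do not come from the paper's released code.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN): share of valid arguments flagged as fallacious."""
    negatives = false_positives + true_negatives
    if negatives == 0:
        raise ValueError("FPR is undefined without any valid (negative) examples")
    return false_positives / negatives


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original rate that was eliminated."""
    if before == 0:
        raise ValueError("relative reduction is undefined for a zero baseline")
    return (before - after) / before


# Rates reported in the abstract for the out-of-distribution evaluation set.
baseline_fpr = 0.64    # before CAD augmentation
augmented_fpr = 0.267  # after CAD augmentation

print(f"absolute drop: {(baseline_fpr - augmented_fpr) * 100:.1f} points")
print(f"relative reduction: {relative_reduction(baseline_fpr, augmented_fpr):.1%}")
# relative reduction: 58.3%
```

The absolute drop is 37.3 percentage points; dividing by the 64% baseline gives the roughly 58% relative reduction the abstract reports.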
Anthology ID:
2026.acl-srw.30
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
363–374
URL:
https://preview.aclanthology.org/ingestion-form-platform/2026.acl-srw.30/
Cite (ACL):
Navyansh Singh. 2026. Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 363–374, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data (Singh, ACL 2026)
PDF:
https://preview.aclanthology.org/ingestion-form-platform/2026.acl-srw.30.pdf