Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data

Navyansh Singh


Abstract
Logical fallacy detection models frequently over-flag valid reasoning due to reliance on surface-level spurious correlations. We introduce 703 LLM-generated Counterfactually Augmented Data (CAD) pairs (minimally differentiated valid and fallacious arguments) to debias models through targeted augmentation. Fine-tuning DeBERTa-v3-large on CoCoLoFa augmented with these pairs yields marginal in-distribution improvement (+0.4% F1) but substantial out-of-distribution robustness: a 58% relative reduction in false positive rate (64% → 26.7%) on a 300-sample Reddit-sourced evaluation set. While recent LLMs (Llama-3.1-8B, Llama-3.3-70B) achieve high performance under optimized prompts (F1 90–94%), they degrade severely under simple human-like prompts (F1 63–72%, FPR 54–74%). Our lightweight, prompt-invariant approach achieves competitive robustness (F1 85.9%, FPR 26.7%) across all prompting regimes without prompt engineering, making it stable for production deployment with unpredictable user input. The dataset and model are publicly released.
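The relative reduction quoted in the abstract follows directly from the two reported false positive rates. The sketch below is illustrative only and is not taken from the paper's code: the function names and the label convention (1 = fallacious, 0 = valid) are assumptions made for this example, and only the 64% and 26.7% figures come from the abstract.

```python
def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN). Assumed convention: 1 = fallacious, 0 = valid."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else 0.0


def relative_reduction(before, after):
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before


# FPR values reported in the abstract for the out-of-distribution set.
baseline_fpr = 0.64
augmented_fpr = 0.267

reduction = relative_reduction(baseline_fpr, augmented_fpr)
print(f"Relative FPR reduction: {reduction:.1%}")
```

With the reported values this gives roughly 58%, matching the abstract. The absolute drop is 37.3 percentage points, which is a different quantity from the relative reduction.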
Anthology ID:
2026.acl-srw.30
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
363–374
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.30/
Cite (ACL):
Navyansh Singh. 2026. Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 363–374, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data (Singh, ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-srw.30.pdf