CoBA: Counterbias Text Augmentation for Mitigating Various Spurious Correlations via Semantic Triples

Kyohoon Jin, Juhwan Choi, JungMin Yun, Junho Lee, Soojin Jang, YoungBin Kim


Abstract
Deep learning models often learn and exploit spurious correlations in training data, using these non-target features to inform their predictions. Such reliance leads to performance degradation and poor generalization on unseen data. To address these limitations, we introduce a more general form of counterfactual data augmentation, termed *counterbias* data augmentation, which simultaneously tackles multiple biases (e.g., gender bias, simplicity bias) and enhances out-of-distribution robustness. We present **CoBA**: **Co**unter**B**ias **A**ugmentation, a unified framework that operates at the semantic triple level: first decomposing text into subject-predicate-object triples, then selectively modifying these triples to disrupt spurious correlations. By reconstructing the text from these adjusted triples, **CoBA** generates *counterbias* data that mitigates spurious patterns. Through extensive experiments, we demonstrate that **CoBA** not only improves downstream task performance, but also effectively reduces biases and strengthens out-of-distribution resilience, offering a versatile and robust solution to the challenges posed by spurious correlations.
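The abstract describes a three-stage pipeline: decompose text into subject-predicate-object triples, selectively perturb the triples that carry a spurious cue, and reconstruct text from the adjusted triples. The sketch below illustrates only that decompose-modify-reconstruct flow; the function names (`extract_triples`, `perturb_triples`, `realize_text`) and their toy bodies are hypothetical placeholders, not the authors' implementation, since the abstract does not specify the extractor or generator used.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object semantic triple."""
    subject: str
    predicate: str
    obj: str


def extract_triples(text: str) -> List[Triple]:
    """Toy stand-in for triple extraction; a real system would use
    an OpenIE tool or a language model to decompose the text."""
    return [Triple("the nurse", "comforted", "her patient")]


def perturb_triples(triples: List[Triple], swaps: Dict[str, str]) -> List[Triple]:
    """Selectively rewrite triple slots that carry a spurious cue
    (e.g., a gendered word or a label-correlated keyword)."""
    return [
        replace(t,
                subject=swaps.get(t.subject, t.subject),
                obj=swaps.get(t.obj, t.obj))
        for t in triples
    ]


def realize_text(triples: List[Triple]) -> str:
    """Toy surface realization; the paper reconstructs fluent text
    from the adjusted triples, here we simply join the slots."""
    return " ".join(f"{t.subject} {t.predicate} {t.obj}." for t in triples)


if __name__ == "__main__":
    source = "The nurse comforted her patient."
    triples = extract_triples(source)
    # Break a spurious gender association in this toy example.
    counterbias = perturb_triples(triples, {"her patient": "his patient"})
    print(realize_text(counterbias))  # the nurse comforted his patient.
```

The augmented (counterbias) sentence would then be added to the training set alongside the original, so the model no longer sees the spurious feature consistently paired with one label.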
Anthology ID:
2025.emnlp-main.520
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
10271–10289
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.520/
Cite (ACL):
Kyohoon Jin, Juhwan Choi, JungMin Yun, Junho Lee, Soojin Jang, and YoungBin Kim. 2025. CoBA: Counterbias Text Augmentation for Mitigating Various Spurious Correlations via Semantic Triples. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 10271–10289, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
CoBA: Counterbias Text Augmentation for Mitigating Various Spurious Correlations via Semantic Triples (Jin et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.520.pdf
Checklist:
 2025.emnlp-main.520.checklist.pdf