AFT-Tab: Adversarial Fine-Tuning for Tabular Data Synthesis with Long Text Columns

Yuhao Zhang, Liang Yan, Shaoming Duan, Xinyu Zha, Jinhang Su, Peiyi Han, Chuanyi Liu


Abstract
Traditional tabular data synthesis methods often overlook the cross-modal heterogeneity of real-world tables, where structured continuous and discrete attributes coexist with unstructured long-text columns. Existing approaches struggle to achieve accurate statistical fidelity for non-textual attributes while also preserving the semantic constraints between textual and non-textual attributes. In this work, we establish the first benchmark for long-text tabular data synthesis and introduce a novel metric, Textual Column Correlation Fidelity (TCCF), to quantify cross-modal semantic alignment. We propose AFT-Tab, an adversarial fine-tuning framework that jointly trains an LLM-based text generator and a deep-learning-based non-textual generator. Through a dual-feedback mechanism guided by an LLM discriminator, AFT-Tab enforces both accurate statistical distributions and the cross-modal semantic constraints. Experimental results show that AFT-Tab significantly outperforms state-of-the-art baselines in statistical fidelity, TCCF, diversity, and downstream task utility.
Anthology ID:
2026.acl-long.209
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4581–4594
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.209/
Cite (ACL):
Yuhao Zhang, Liang Yan, Shaoming Duan, Xinyu Zha, Jinhang Su, Peiyi Han, and Chuanyi Liu. 2026. AFT-Tab: Adversarial Fine-Tuning for Tabular Data Synthesis with Long Text Columns. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4581–4594, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
AFT-Tab: Adversarial Fine-Tuning for Tabular Data Synthesis with Long Text Columns (Zhang et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.209.pdf
Checklist:
2026.acl-long.209.checklist.pdf