Aayush Kumar

2026

TEN: Table Explicitization, Neurosymbolically
Nikita Mehrotra | Aayush Kumar | Sumit Gulwani | Arjun Radhakrishna | Ashish Tiwari
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)

We present TEN, a neurosymbolic approach for extracting tabular data from semistructured text such as copy-pasted content from PDFs, emails, or OCR-flattened outputs. This task poses real-world challenges in domains like finance and healthcare, where manual copy-paste into spreadsheets introduces errors and OCR distortions compromise data integrity, leading to financial losses and flawed decisions. Purely neural methods suffer from hallucinations and structural inconsistencies, which hinder robust deployment. TEN addresses this with a novel triadic feedback loop that iteratively refines table hypotheses to enforce constraints and achieve verifiable convergence. Experiments show that TEN outperforms neural baselines in exact-match accuracy and has lower hallucination rates. In a 21-participant user study, participants rated TEN's tables as more accurate and preferred them in over 60% of pairwise comparisons, though verification and correction effort did not differ significantly between conditions.
