@inproceedings{poladi-dandapat-2025-comparing,
    title = "Comparing Language Models of Different Scales for Security-Focused Tabular Query Generation and Reasoning",
    author = "Poladi, Varivashya and
      Dandapat, Sandipan",
    editor = "Inui, Kentaro and
      Sakti, Sakriani and
      Wang, Haofen and
      Wong, Derek F. and
      Bhattacharyya, Pushpak and
      Banerjee, Biplab and
      Ekbal, Asif and
      Chakraborty, Tanmoy and
      Singh, Dhirendra Pratap",
    booktitle = "Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics",
    month = dec,
    year = "2026",
    address = "Mumbai, India",
    publisher = "The Asian Federation of Natural Language Processing and The Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-long.55/",
    pages = "1002--1016",
    isbn = "979-8-89176-298-5",
    abstract = "Security-related data often exists in complex, multi-table formats and is scarce due to privacy and compliance constraints{---}posing a major challenge for training and evaluating language models (LMs) on security reasoning tasks. In this work, we systematically investigate the performance of large language models (LLMs) across different parameter scales in generating and solving multi-step, semantically rich queries over realistic security scenarios represented through three interlinked tabular datasets. We assess models on three key axes (i) their ability to formulate insightful, high complexity security questions; (ii) the quality and coherence of their reasoning chains; and (iii) their accuracy in deriving actionable answers from the underlying data. To address data scarcity, we propose a diffusion-based synthetic data generation pipeline that amplifies the existing dataset while preserving domain semantics and statistical structure. Our findings reveal that while large models often outperform in reasoning depth and query formulation, smaller models show surprising efficiency and accuracy. The study provides actionable insights for deploying generative models in security analytics and opens avenues for synthetic data-driven evaluation of LLMs in low-resource, high-stakes domains."
}
@comment{Markdown (Informal):
[Comparing Language Models of Different Scales for Security-Focused Tabular Query Generation and Reasoning](https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-long.55/) (Poladi & Dandapat, IJCNLP-AACL 2025)
ACL}