Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios

Yilun Zhao, Haowei Zhang, Shengyun Si, Linyong Nan, Xiangru Tang, Arman Cohan


Abstract
Tabular data is prevalent across industries, and understanding and manipulating it for information-seeking purposes demands significant time and effort from users. Advances in large language models (LLMs) have shown enormous potential to improve user efficiency. However, the adoption of LLMs in real-world applications for table information seeking remains underexplored. In this paper, we investigate the table-to-text capabilities of different LLMs using four datasets within two real-world information seeking scenarios: the LogicNLG and our newly constructed LoTNLG datasets for data insight generation, and the FeTaQA and our newly constructed F2WTQ datasets for query-based generation. We structure our investigation around three research questions, evaluating the performance of LLMs in table-to-text generation, automated evaluation, and feedback generation, respectively. Experimental results indicate that the current high-performing LLM, GPT-4, can effectively serve as a table-to-text generator, evaluator, and feedback generator, facilitating users’ information-seeking purposes in real-world scenarios. However, a significant performance gap remains between open-source LLMs (e.g., Vicuna and LLaMA-2) and GPT-4. Our data and code are publicly available at https://github.com/yale-nlp/LLM-T2T.
Anthology ID:
2023.emnlp-industry.17
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
December
Year:
2023
Address:
Singapore
Editors:
Mingxuan Wang, Imed Zitouni
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
160–175
URL:
https://aclanthology.org/2023.emnlp-industry.17
DOI:
10.18653/v1/2023.emnlp-industry.17
Bibkey:
Cite (ACL):
Yilun Zhao, Haowei Zhang, Shengyun Si, Linyong Nan, Xiangru Tang, and Arman Cohan. 2023. Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 160–175, Singapore. Association for Computational Linguistics.
Cite (Informal):
Investigating Table-to-Text Generation Capabilities of Large Language Models in Real-World Information Seeking Scenarios (Zhao et al., EMNLP 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.emnlp-industry.17.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-5/2023.emnlp-industry.17.mp4