Do Large Language Models have Problem-Solving Capability under Incomplete Information Scenarios?

Yuyan Chen, Yueze Li, Songzhou Yan, Sijia Liu, Jiaqing Liang, Yanghua Xiao


Abstract
Evaluating the problem-solving capability of Large Language Models (LLMs) under incomplete information scenarios is increasingly important, as it encompasses capabilities such as questioning, knowledge search, error detection, and path planning. Current research mainly focuses on LLMs’ problem-solving capability in games such as “Twenty Questions”. However, these kinds of games do not require recognizing misleading cues, which is necessary in incomplete information scenarios. Moreover, existing games such as “Who is undercover” are highly subjective, making them challenging to evaluate. Therefore, in this paper, we introduce a novel game named BrainKing, based on “Who is undercover” and “Twenty Questions”, for evaluating LLM capabilities under incomplete information scenarios. It requires LLMs to identify target entities with a limited number of yes-or-no questions and potentially misleading answers. By setting up easy, medium, and hard difficulty modes, we comprehensively assess the performance of LLMs across various aspects. Our results reveal the capabilities and limitations of LLMs in BrainKing, providing significant insights into LLM problem-solving levels.
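To make the game setting described in the abstract concrete, the following is a minimal toy sketch of a yes-or-no guessing game in which some answers may be misleading. It is not the BrainKing implementation: the entity table `ENTITIES`, the functions `answer` and `play`, the `misleading_turns` parameter, and the mismatch-counting guesser are all hypothetical, chosen only to illustrate why misleading answers make the task harder than plain "Twenty Questions".

```python
from typing import Dict, List, Set

# Hypothetical toy knowledge base (an assumption, not from the paper):
# each entity maps yes-or-no attributes to their true values.
ENTITIES: Dict[str, Dict[str, bool]] = {
    "eagle":     {"is_animal": True,  "can_fly": True,  "lives_in_water": False},
    "shark":     {"is_animal": True,  "can_fly": False, "lives_in_water": True},
    "drone":     {"is_animal": False, "can_fly": True,  "lives_in_water": False},
    "submarine": {"is_animal": False, "can_fly": False, "lives_in_water": True},
}


def answer(target: str, attribute: str, turn: int, misleading_turns: Set[int]) -> bool:
    """Answer a yes-or-no question about the target.

    On turns listed in misleading_turns the true answer is flipped,
    which models a misleading cue.
    """
    truth = ENTITIES[target][attribute]
    return (not truth) if turn in misleading_turns else truth


def play(target: str, questions: List[str], misleading_turns: Set[int],
         max_questions: int = 20) -> str:
    """Ask up to max_questions questions and guess the entity whose
    attributes disagree with the fewest answers received."""
    mismatches = {name: 0 for name in ENTITIES}
    for turn, attribute in enumerate(questions[:max_questions]):
        reply = answer(target, attribute, turn, misleading_turns)
        for name, attrs in ENTITIES.items():
            if attrs[attribute] != reply:
                mismatches[name] += 1
    # Ties are broken by dictionary order, which is arbitrary here.
    return min(mismatches, key=mismatches.get)


basic = ["is_animal", "can_fly", "lives_in_water"]
print(play("shark", basic, set()))  # truthful answers only
print(play("shark", basic, {0}))    # first answer is misleading
print(play("shark", basic + ["is_animal", "is_animal"], {0}))  # repeated check
```

In this toy setup a single flipped answer is enough to send the guesser to the wrong entity, while re-asking the doubtful question lets the truthful answers outvote the misleading one. That costs extra questions, which is the trade-off a limited question budget forces.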
Anthology ID:
2024.findings-acl.131
Volume:
Findings of the Association for Computational Linguistics: ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2225–2238
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-acl.131/
DOI:
10.18653/v1/2024.findings-acl.131
Cite (ACL):
Yuyan Chen, Yueze Li, Songzhou Yan, Sijia Liu, Jiaqing Liang, and Yanghua Xiao. 2024. Do Large Language Models have Problem-Solving Capability under Incomplete Information Scenarios?. In Findings of the Association for Computational Linguistics: ACL 2024, pages 2225–2238, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Do Large Language Models have Problem-Solving Capability under Incomplete Information Scenarios? (Chen et al., Findings 2024)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-acl.131.pdf