Do Large Language Models have Problem-Solving Capability under Incomplete Information Scenarios?
Yuyan Chen, Yueze Li, Songzhou Yan, Sijia Liu, Jiaqing Liang, Yanghua Xiao
Abstract
Evaluating the problem-solving capability of Large Language Models (LLMs) under incomplete information scenarios is increasingly important, encompassing capabilities such as questioning, knowledge search, error detection, and path planning. Current research mainly focuses on LLMs’ problem-solving capability in games such as “Twenty Questions”. However, these kinds of games do not require recognizing misleading cues, which is necessary in incomplete information scenarios. Moreover, existing games such as “Who is undercover” are highly subjective, making evaluation challenging. Therefore, in this paper, we introduce a novel game named BrainKing, based on “Who is undercover” and “Twenty Questions”, for evaluating LLM capabilities under incomplete information scenarios. It requires LLMs to identify target entities with limited yes-or-no questions and potentially misleading answers. By setting up easy, medium, and hard difficulty modes, we comprehensively assess the performance of LLMs across various aspects. Our results reveal the capabilities and limitations of LLMs in BrainKing, providing significant insights into LLM problem-solving levels.
- Anthology ID:
- 2024.findings-acl.131
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2225–2238
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-acl.131/
- DOI:
- 10.18653/v1/2024.findings-acl.131
- Cite (ACL):
- Yuyan Chen, Yueze Li, Songzhou Yan, Sijia Liu, Jiaqing Liang, and Yanghua Xiao. 2024. Do Large Language Models have Problem-Solving Capability under Incomplete Information Scenarios? In Findings of the Association for Computational Linguistics: ACL 2024, pages 2225–2238, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- Do Large Language Models have Problem-Solving Capability under Incomplete Information Scenarios? (Chen et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-acl.131.pdf