@inproceedings{wang-etal-2024-fac2e,
    title = "{FAC}$^2${E}: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition",
    author = "Wang, Xiaoqiang  and
      Wu, Lingfei  and
      Ma, Tengfei  and
      Liu, Bang",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.emnlp-main.734/",
    doi = "10.18653/v1/2024.emnlp-main.734",
    pages = "13228--13243",
    abstract = "Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks. However, such a paradigm fails to comprehensively differentiate the fine-grained language and cognitive skills, rendering the lack of sufficient interpretation to LLMs' capabilities. In this paper, we present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation. Specifically, we formulate LLMs' evaluation in a multi-dimensional and explainable manner by dissociating the language-related capabilities and the cognition-related ones. Besides, through extracting the intermediate reasoning from LLMs, we further break down the process of applying a specific capability into three sub-steps: recalling relevant knowledge, utilizing knowledge, and solving problems. Finally, FAC$^2$E evaluates each sub-step of each fine-grained capability, providing a two-faceted diagnosis for LLMs. Utilizing FAC$^2$E, we identify a common shortfall in knowledge utilization among models and propose a straightforward, knowledge-enhanced method to mitigate this issue. Our results not only showcase promising performance enhancements but also highlight a direction for future LLM advancements."
}