MAVEN-FACT: A Large-scale Event Factuality Detection Dataset

Chunyang Li, Hao Peng, Xiaozhi Wang, Yunjia Qi, Lei Hou, Bin Xu, Juanzi Li


Abstract
The Event Factuality Detection (EFD) task determines the factuality of textual events, i.e., it classifies whether an event is a fact, a possibility, or an impossibility, which is essential for faithfully understanding and utilizing event knowledge. However, due to the lack of high-quality large-scale data, event factuality detection is under-explored in event understanding research, which limits the development of the EFD community. To address these issues and provide faithful event understanding, we introduce MAVEN-FACT, a large-scale and high-quality EFD dataset based on the MAVEN dataset. MAVEN-FACT includes factuality annotations of 112,276 events, making it the largest EFD dataset. Extensive experiments demonstrate that MAVEN-FACT is challenging for both conventional fine-tuned models and large language models (LLMs). Thanks to the comprehensive annotations of event arguments and relations in MAVEN, MAVEN-FACT also supports further analyses, and we find that adopting event arguments and relations helps event factuality detection for fine-tuned models but does not benefit LLMs. Furthermore, we conduct a preliminary study of an application case of event factuality detection and find that it helps mitigate event-related hallucination in LLMs. We will release our dataset and code to facilitate further research on event factuality detection.
Anthology ID:
2024.findings-emnlp.651
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
11140–11158
URL:
https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.651/
DOI:
10.18653/v1/2024.findings-emnlp.651
Cite (ACL):
Chunyang Li, Hao Peng, Xiaozhi Wang, Yunjia Qi, Lei Hou, Bin Xu, and Juanzi Li. 2024. MAVEN-FACT: A Large-scale Event Factuality Detection Dataset. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 11140–11158, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
MAVEN-FACT: A Large-scale Event Factuality Detection Dataset (Li et al., Findings 2024)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.651.pdf
Software:
2024.findings-emnlp.651.software.zip