Extracting Financial Events from Raw Texts via Matrix Chunking

Yusheng Huang, Ning Hu, Kunping Li, Nan Wang, Zhouhan Lin


Abstract
Event Extraction (EE) is widely used in the Chinese financial field to provide valuable structured information. However, there are two key challenges for Chinese financial EE in application scenarios. First, events need to be extracted from raw texts, which sets it apart from previous works like the Automatic Content Extraction (ACE) EE task, where EE is treated as a classification problem given the entity spans. Second, recognizing financial entities can be laborious, as they may involve multiple elements. In this paper, we introduce CFTE, a novel task for Chinese Financial Text-to-Event extraction, which directly extracts financial events from raw texts. We further present FINEED, a Chinese FINancial Event Extraction Dataset, and an efficient MAtrix-ChunKing method called MACK, designed for the extraction of financial events from raw texts. Specifically, FINEED is manually annotated with rich linguistic features. We propose a novel two-dimensional annotation method for FINEED, which can visualize the interactions among text components. Our MACK method is fault-tolerant by preserving the tag frequency distribution when identifying financial entities. We conduct extensive experiments and the results verify the effectiveness of our MACK method.
Anthology ID:
2024.lrec-main.617
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
7035–7044
URL:
https://aclanthology.org/2024.lrec-main.617
Cite (ACL):
Yusheng Huang, Ning Hu, Kunping Li, Nan Wang, and Zhouhan Lin. 2024. Extracting Financial Events from Raw Texts via Matrix Chunking. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7035–7044, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Extracting Financial Events from Raw Texts via Matrix Chunking (Huang et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2024.lrec-main.617.pdf