Yusheng Huang


2024

Extracting Financial Events from Raw Texts via Matrix Chunking
Yusheng Huang | Ning Hu | Kunping Li | Nan Wang | Zhouhan Lin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Event Extraction (EE) is widely used in the Chinese financial field to provide valuable structured information. However, there are two key challenges for Chinese financial EE in application scenarios. First, events need to be extracted from raw texts, which sets the task apart from previous work such as the Automatic Content Extraction (ACE) EE task, where EE is treated as a classification problem given the entity spans. Second, recognizing financial entities can be laborious, as they may involve multiple elements. In this paper, we introduce CFTE, a novel task for Chinese Financial Text-to-Event extraction, which directly extracts financial events from raw texts. We further present FINEED, a Chinese FINancial Event Extraction Dataset, and an efficient MAtrix-ChunKing method called MACK, designed for the extraction of financial events from raw texts. Specifically, FINEED is manually annotated with rich linguistic features. We propose a novel two-dimensional annotation method for FINEED, which can visualize the interactions among text components. Our MACK method achieves fault tolerance by preserving the tag frequency distribution when identifying financial entities. We conduct extensive experiments, and the results verify the effectiveness of our MACK method.

2021

Exploring Sentence Community for Document-Level Event Extraction
Yusheng Huang | Weijia Jia
Findings of the Association for Computational Linguistics: EMNLP 2021

Document-level event extraction is critical to various natural language processing tasks because it provides structured information. Existing approaches based on sequential modeling neglect the complex logical structures of long texts. In this paper, we leverage the entity interactions and sentence interactions within long documents and transform each document into an undirected unweighted graph by exploiting the relationships between sentences. We introduce the Sentence Community to represent each event as a subgraph. Furthermore, our framework SCDEE retains the ability to extract multiple events through sentence community detection using graph attention networks, and alleviates the role-overlapping issue by predicting arguments in terms of roles. Experiments demonstrate that our framework achieves results competitive with state-of-the-art methods on the large-scale document-level event extraction dataset.