Abstract
Information Extraction (IE), aiming to extract structured information from unstructured natural language texts, can significantly benefit from pre-trained language models. However, existing pre-training methods solely focus on exploiting the textual knowledge, relying extensively on annotated large-scale datasets, which is labor-intensive and thus limits the scalability and versatility of the resulting models. To address these issues, we propose SKIE, a novel pre-training framework tailored for IE that integrates structural semantic knowledge via contrastive learning, effectively alleviating the annotation burden. Specifically, SKIE utilizes Abstract Meaning Representation (AMR) as a low-cost supervision source to boost model performance without human intervention. By enhancing the topology of AMR graphs, SKIE derives high-quality cohesive subgraphs as additional training samples, providing diverse multi-level structural semantic knowledge. Furthermore, SKIE refines the graph encoder to better capture cohesive information and edge relation information, thereby improving the pre-training efficacy. Extensive experimental results demonstrate that SKIE outperforms state-of-the-art baselines across multiple IE tasks and showcases exceptional performance in few-shot and zero-shot settings.
- Anthology ID:
- 2024.emnlp-main.129
- Volume:
- Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2156–2171
- URL:
- https://preview.aclanthology.org/add_missing_videos/2024.emnlp-main.129/
- DOI:
- 10.18653/v1/2024.emnlp-main.129
- Cite (ACL):
- Xiaoyang Yi, Yuru Bao, Jian Zhang, Yifang Qin, and Faxin Lin. 2024. Integrating Structural Semantic Knowledge for Enhanced Information Extraction Pre-training. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 2156–2171, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Integrating Structural Semantic Knowledge for Enhanced Information Extraction Pre-training (Yi et al., EMNLP 2024)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2024.emnlp-main.129.pdf