Abstract
Keyphrase extraction aims to extract a set of phrases that capture the central idea of the source document. In a structured document, there are certain locations (e.g., the title or the first sentence) where a keyphrase is most likely to appear. However, when extracting keyphrases from a document, most existing embedding-based unsupervised keyphrase extraction models ignore the indicative role of the highlights in these locations, leading to incorrect keyphrase extraction. In this paper, we propose a new Highlight-Guided Unsupervised Keyphrase Extraction model (HGUKE) to address this issue. Specifically, HGUKE first models the phrase-document relevance via the highlights of the document. Next, HGUKE calculates the cross-phrase relevance between all candidate phrases. Finally, HGUKE aggregates these two relevance scores into an importance score for each candidate phrase, which is used to rank and extract keyphrases. Experimental results on three benchmarks demonstrate that HGUKE outperforms state-of-the-art unsupervised keyphrase extraction baselines.
- Anthology ID:
- 2023.findings-acl.66
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2023
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1041–1048
- URL:
- https://aclanthology.org/2023.findings-acl.66
- DOI:
- 10.18653/v1/2023.findings-acl.66
- Cite (ACL):
- Mingyang Song, Huafeng Liu, Yi Feng, and Liping Jing. 2023. Improving Embedding-based Unsupervised Keyphrase Extraction by Incorporating Structural Information. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1041–1048, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Improving Embedding-based Unsupervised Keyphrase Extraction by Incorporating Structural Information (Song et al., Findings 2023)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/2023.findings-acl.66.pdf