Abstract
To keep pace with the increased generation and digitization of documents, automated methods that can improve search, discovery, and mining of the vast body of literature are essential. Keyphrases provide a concise representation by identifying salient concepts in a document. Various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than their unsupervised counterparts. Unfortunately, these approaches fail for short documents where the context is unclear. Moreover, keyphrases, which usually capture the gist of a document, need to reflect its central theme. We propose a new extraction model that introduces a centrality constraint to enrich the word representation of a bidirectional long short-term memory (BiLSTM) network. Performance evaluation on two publicly available datasets demonstrates that our model outperforms existing state-of-the-art approaches.
- Anthology ID:
- 2021.bionlp-1.17
- Volume:
- Proceedings of the 20th Workshop on Biomedical Language Processing
- Month:
- June
- Year:
- 2021
- Address:
- Online
- Venue:
- BioNLP
- SIG:
- SIGBIOMED
- Publisher:
- Association for Computational Linguistics
- Pages:
- 155–161
- URL:
- https://aclanthology.org/2021.bionlp-1.17
- DOI:
- 10.18653/v1/2021.bionlp-1.17
- Cite (ACL):
- Zelalem Gero and Joyce Ho. 2021. Word centrality constrained representation for keyphrase extraction. In Proceedings of the 20th Workshop on Biomedical Language Processing, pages 155–161, Online. Association for Computational Linguistics.
- Cite (Informal):
- Word centrality constrained representation for keyphrase extraction (Gero & Ho, BioNLP 2021)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2021.bionlp-1.17.pdf
- Code:
- zhgero/keyphrases_centrality