Extracting Discriminative Keyphrases with Learned Semantic Hierarchies

Yunli Wang, Yong Jin, Xiaodan Zhu, Cyril Goutte


Abstract
The goal of keyphrase extraction is to automatically identify the most salient phrases from documents. The technique has a wide range of applications, such as rendering a quick glimpse of a document or extracting key content for further use. While previous work often assumes keyphrases are a static property of a given document, in many applications the appropriate set of keyphrases to extract depends on the set of documents being considered together. In particular, good keyphrases should not only accurately describe the content of a document, but also reveal what discriminates it from the other documents. In this paper, we study this problem of extracting discriminative keyphrases. Specifically, we propose to use the hierarchical semantic structure between candidate keyphrases to promote keyphrases that have the right level of specificity to clearly distinguish the target document from others. We show that such knowledge can be used to construct better discriminative keyphrase extraction systems that do not assume a static, fixed set of keyphrases for a document. We show how this helps identify key expertise of authors from their papers, as well as competencies covered by online courses within different domains.
Anthology ID:
C16-1089
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
932–942
URL:
https://aclanthology.org/C16-1089
Cite (ACL):
Yunli Wang, Yong Jin, Xiaodan Zhu, and Cyril Goutte. 2016. Extracting Discriminative Keyphrases with Learned Semantic Hierarchies. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 932–942, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Extracting Discriminative Keyphrases with Learned Semantic Hierarchies (Wang et al., COLING 2016)
PDF:
https://preview.aclanthology.org/nodalida-main-page/C16-1089.pdf