A Corpus of Very Short Scientific Summaries

Yifan Chen, Tamara Polajnar, Colin Batchelor, Simone Teufel


Abstract
We present a new summarisation task: taking scientific articles in the chemistry domain and producing journal table-of-contents entries. These are one- or two-sentence author-written summaries that present the key findings of a paper. This is a first look at the task, using an open-access publication corpus in which titles and abstracts serve as the input texts and short author-written advertising blurbs serve as the ground truth. We introduce the dataset and evaluate state-of-the-art summarisation methods on it.
Anthology ID:
2020.conll-1.12
Volume:
Proceedings of the 24th Conference on Computational Natural Language Learning
Month:
November
Year:
2020
Address:
Online
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Pages:
153–164
URL:
https://aclanthology.org/2020.conll-1.12
DOI:
10.18653/v1/2020.conll-1.12
Cite (ACL):
Yifan Chen, Tamara Polajnar, Colin Batchelor, and Simone Teufel. 2020. A Corpus of Very Short Scientific Summaries. In Proceedings of the 24th Conference on Computational Natural Language Learning, pages 153–164, Online. Association for Computational Linguistics.
Cite (Informal):
A Corpus of Very Short Scientific Summaries (Chen et al., CoNLL 2020)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2020.conll-1.12.pdf
Code:
atulkum/pointer_summarizer