Supervised Machine Learning for Extractive Query Based Summarisation of Biomedical Data

Mandeep Kaur, Diego Mollá


Abstract
The automation of text summarisation of biomedical publications is a pressing need due to the plethora of information available online. This paper explores the impact of several supervised machine learning approaches for extracting multi-document summaries for given queries. In particular, we compare classification and regression approaches for query-based extractive summarisation using data provided by the BioASQ Challenge. We tackle the problem of annotating sentences for training classification systems and show that a simple annotation approach outperforms regression-based summarisation.
Anthology ID:
W18-5604
Volume:
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis
Month:
October
Year:
2018
Address:
Brussels, Belgium
Venue:
Louhi
Publisher:
Association for Computational Linguistics
Pages:
29–37
URL:
https://aclanthology.org/W18-5604
DOI:
10.18653/v1/W18-5604
Cite (ACL):
Mandeep Kaur and Diego Mollá. 2018. Supervised Machine Learning for Extractive Query Based Summarisation of Biomedical Data. In Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis, pages 29–37, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Supervised Machine Learning for Extractive Query Based Summarisation of Biomedical Data (Kaur & Mollá, Louhi 2018)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W18-5604.pdf
Data
BioASQ