2018
Adapting Neural Single-Document Summarization Model for Abstractive Multi-Document Summarization: A Pilot Study
Jianmin Zhang | Jiwei Tan | Xiaojun Wan
Proceedings of the 11th International Conference on Natural Language Generation
To date, neural abstractive summarization methods have achieved great success in single-document summarization (SDS). However, because large-scale multi-document summary datasets are lacking, such methods can hardly be applied to multi-document summarization (MDS). In this paper, we investigate neural abstractive methods for MDS by adapting a state-of-the-art neural abstractive summarization model for SDS. We propose an approach that extends a neural abstractive model trained on large-scale SDS data to the MDS task. Our approach uses only a small number of multi-document summaries for fine-tuning. Experimental results on two benchmark DUC datasets demonstrate that our approach can outperform a variety of baseline neural models.
2017
Towards Automatic Construction of News Overview Articles by News Synthesis
Jianmin Zhang | Xiaojun Wan
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
In this paper we investigate a new task of automatically constructing an overview article from a given set of news articles about a news event. We propose a news synthesis approach to address this task based on passage segmentation, ranking, selection and merging. Our proposed approach is compared with several typical multi-document summarization methods on the Wikinews dataset, and achieves the best performance on both automatic evaluation and manual evaluation.
Content Selection for Real-time Sports News Construction from Commentary Texts
Jin-ge Yao | Jianmin Zhang | Xiaojun Wan | Jianguo Xiao
Proceedings of the 10th International Conference on Natural Language Generation
We study the task of automatically constructing sports news reports from live commentary, focusing on content selection. Rather than receiving every piece of text of a sports match before news construction, as in previous related work, we verify the feasibility of a more challenging but more useful setting: generating news reports on the fly by treating live text input as a stream. Specifically, we design various scoring functions to address different requirements of the task. The near submodularity of the scoring functions makes it possible to adapt efficient greedy algorithms even in stream data settings. Experiments suggest that our proposed framework can already produce results comparable with previous work that relies on a supervised learning-to-rank model with heavy feature engineering.
2016
PKUSUMSUM: A Java Platform for Multilingual Document Summarization
Jianmin Zhang | Tianming Wang | Xiaojun Wan
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations
PKUSUMSUM is a Java platform for multilingual document summarization, and it supports multiple languages, integrates 10 automatic summarization methods, and tackles three typical summarization tasks. The summarization platform has been released and users can easily use and update it. In this paper, we make a brief description of the characteristics, the summarization methods, and the evaluation results of the platform, and also compare PKUSUMSUM with other summarization toolkits.
Towards Constructing Sports News from Live Text Commentary
Jianmin Zhang | Jin-ge Yao | Xiaojun Wan
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)