Abstract
Recent research using pre-trained language models for the multi-document summarization (MDS) task lacks a deep investigation of potential erroneous cases and of the applicability of such models to languages other than English. In this work, we apply a pre-trained language model (BART) to the MDS task, both with and without fine-tuning. We use two English datasets and one German dataset for this study. First, we reproduce multi-document summaries for English by following one of the recent studies. Next, we show the applicability of the model to German by achieving state-of-the-art performance on German MDS. We perform an in-depth error analysis of the followed approach for both languages, which leads us to identify the most notable error types, ranging from made-up facts to topic delimitation, and to quantify the amount of extractiveness.
- Anthology ID:
- 2021.nodalida-main.43
- Volume:
- Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
- Month:
- 31 May – 2 June
- Year:
- 2021
- Address:
- Reykjavik, Iceland (Online)
- Editors:
- Simon Dobnik, Lilja Øvrelid
- Venue:
- NoDaLiDa
- Publisher:
- Linköping University Electronic Press, Sweden
- Pages:
- 391–397
- URL:
- https://aclanthology.org/2021.nodalida-main.43
- Cite (ACL):
- Timo Johner, Abhik Jana, and Chris Biemann. 2021. Error Analysis of using BART for Multi-Document Summarization: A Study for English and German Language. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa), pages 391–397, Reykjavik, Iceland (Online). Linköping University Electronic Press, Sweden.
- Cite (Informal):
- Error Analysis of using BART for Multi-Document Summarization: A Study for English and German Language (Johner et al., NoDaLiDa 2021)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/2021.nodalida-main.43.pdf
- Code
- uhh-lt/multi-summ-german
- Data
- CNN/Daily Mail
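
The setup described in the abstract (applying pre-trained BART to multi-document summarization, with and without fine-tuning) can be illustrated with a minimal sketch using the Hugging Face transformers library. This is an assumption-laden illustration, not the authors' exact pipeline: the checkpoint name `facebook/bart-large-cnn` (BART fine-tuned on CNN/Daily Mail, one of the listed datasets) and the document-concatenation strategy, a common MDS baseline, are assumptions.

```python
# Minimal sketch: multi-document summarization with BART via Hugging Face
# transformers. Concatenating source documents into one input is a common
# MDS baseline; this is illustrative, not the paper's exact pipeline.
from transformers import BartForConditionalGeneration, BartTokenizer

# Assumed checkpoint: BART fine-tuned on CNN/Daily Mail single-doc summaries.
MODEL_NAME = "facebook/bart-large-cnn"

tokenizer = BartTokenizer.from_pretrained(MODEL_NAME)
model = BartForConditionalGeneration.from_pretrained(MODEL_NAME)

def summarize(documents, max_input_tokens=1024, max_summary_tokens=142):
    # Concatenate the source documents into a single input sequence;
    # anything beyond BART's 1024-token limit is truncated.
    text = " ".join(documents)
    inputs = tokenizer(text, max_length=max_input_tokens,
                       truncation=True, return_tensors="pt")
    summary_ids = model.generate(inputs["input_ids"],
                                 num_beams=4,
                                 max_length=max_summary_tokens,
                                 early_stopping=True)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)

docs = ["First news article about an event ...",
        "Second article covering the same event ..."]
print(summarize(docs))
```

For the fine-tuned condition, the same model class would be trained further on an MDS dataset before generation; applying the approach to German would additionally require a German or multilingual checkpoint.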