Abstract
This system description paper details TEAM UFAL’s approach for the SummScreen - TVMegaSite subtask of the CreativeSumm shared task. The subtask deals with creating summaries for dialogues from TV soap operas. We utilized a BART-based pre-trained model fine-tuned on the SamSum dialogue summarization dataset. A few examples from the AutoMin dataset and from the dataset provided by the organizers were also inserted into the training data as a few-shot learning objective. The additional data was manually broken into chunks based on the boundaries in the summary and the dialogue files. For inference we chose a strategy similar to that of the top-performing team at AutoMin 2021, where the data is split into chunks, either on [SCENE_CHANGE] or on exceeding a pre-defined token length, so that each example fits within the pre-trained model's maximum input length. The final training strategy was chosen based on how natural the responses looked rather than on how well the model performed on automated evaluation metrics such as ROUGE.
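The chunking rule described above lends itself to a short sketch: split the transcript on [SCENE_CHANGE] markers, then further split any scene whose token count exceeds the model's input budget. The following Python sketch is illustrative only; the checkpoint name, the 1024-token limit, and the function name are assumptions for illustration, not taken from the authors' code.

```python
# Hypothetical sketch of the chunking strategy from the abstract.
# Assumptions: a BART tokenizer (checkpoint name is illustrative)
# and a 1024-token encoder input budget.
from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
MAX_TOKENS = 1024  # assumed maximum encoder input length for BART

def chunk_transcript(transcript: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    chunks = []
    # First-level split: scene boundaries.
    for scene in transcript.split("[SCENE_CHANGE]"):
        scene = scene.strip()
        if not scene:
            continue
        current, current_len = [], 0
        # Second-level split: flush a chunk once the token budget is hit.
        for line in scene.splitlines():
            n = len(tokenizer.tokenize(line))
            if current and current_len + n > max_tokens:
                chunks.append("\n".join(current))
                current, current_len = [], 0
            # A single oversized line stays as its own chunk in this sketch.
            current.append(line)
            current_len += n
        if current:
            chunks.append("\n".join(current))
    return chunks
```

Each resulting chunk would then be summarized independently and the partial summaries concatenated, mirroring the chunk-and-summarize inference strategy the abstract attributes to the top-performing AutoMin 2021 team.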
- Anthology ID:
- 2022.creativesumm-1.4
- Volume:
- Proceedings of The Workshop on Automatic Summarization for Creative Writing
- Month:
- October
- Year:
- 2022
- Address:
- Gyeongju, Republic of Korea
- Editor:
- Kathleen McKeown
- Venue:
- CreativeSumm
- Publisher:
- Association for Computational Linguistics
- Pages:
- 24–28
- URL:
- https://aclanthology.org/2022.creativesumm-1.4
- Cite (ACL):
- Rishu Kumar and Rudolf Rosa. 2022. TEAM UFAL @ CreativeSumm 2022: BART and SamSum based few-shot approach for creative Summarization. In Proceedings of The Workshop on Automatic Summarization for Creative Writing, pages 24–28, Gyeongju, Republic of Korea. Association for Computational Linguistics.
- Cite (Informal):
- TEAM UFAL @ CreativeSumm 2022: BART and SamSum based few-shot approach for creative Summarization (Kumar & Rosa, CreativeSumm 2022)
- PDF:
- https://aclanthology.org/2022.creativesumm-1.4.pdf
- Data
- SummScreen