A Generative Approach to Titling and Clustering Wikipedia Sections

Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, Abe Ittycheriah

Abstract
We evaluate the performance of transformer encoders with various decoders for information organization through a new task: generation of section headings for Wikipedia articles. Our analysis shows that decoders containing attention mechanisms over the encoder output achieve high-scoring results by generating extractive text. In contrast, a decoder without attention better facilitates semantic encoding and can be used to generate section embeddings. We additionally introduce a new loss function, which further encourages the decoder to generate high-quality embeddings.
Anthology ID:
2020.ngt-1.9
Volume:
Proceedings of the Fourth Workshop on Neural Generation and Translation
Month:
July
Year:
2020
Address:
Online
Editors:
Alexandra Birch, Andrew Finch, Hiroaki Hayashi, Kenneth Heafield, Marcin Junczys-Dowmunt, Ioannis Konstas, Xian Li, Graham Neubig, Yusuke Oda
Venue:
NGT
Publisher:
Association for Computational Linguistics
Pages:
79–87
URL:
https://aclanthology.org/2020.ngt-1.9
DOI:
10.18653/v1/2020.ngt-1.9
Cite (ACL):
Anjalie Field, Sascha Rothe, Simon Baumgartner, Cong Yu, and Abe Ittycheriah. 2020. A Generative Approach to Titling and Clustering Wikipedia Sections. In Proceedings of the Fourth Workshop on Neural Generation and Translation, pages 79–87, Online. Association for Computational Linguistics.
Cite (Informal):
A Generative Approach to Titling and Clustering Wikipedia Sections (Field et al., NGT 2020)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/2020.ngt-1.9.pdf
Video:
http://slideslive.com/38929822