Data-Driven News Generation for Automated Journalism

Leo Leppänen, Myriam Munezero, Mark Granroth-Wilding, Hannu Toivonen


Abstract
Despite increasing amounts of data and ever-improving natural language generation techniques, work on automated journalism remains relatively scarce. In this paper, we explore the field and the challenges associated with building a journalistic natural language generation system. We present a set of requirements that should guide system design, including transparency, accuracy, modifiability and transferability. Guided by these requirements, we present a data-driven architecture for automated journalism that is largely domain- and language-independent. We illustrate its practical application in the production of news articles about the 2017 Finnish municipal elections in three languages, demonstrating the success of the design's data-driven, modular approach. We then draw some lessons for future automated journalism.
Anthology ID:
W17-3528
Volume:
Proceedings of the 10th International Conference on Natural Language Generation
Month:
September
Year:
2017
Address:
Santiago de Compostela, Spain
Editors:
Jose M. Alonso, Alberto Bugarín, Ehud Reiter
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
188–197
URL:
https://aclanthology.org/W17-3528
DOI:
10.18653/v1/W17-3528
Cite (ACL):
Leo Leppänen, Myriam Munezero, Mark Granroth-Wilding, and Hannu Toivonen. 2017. Data-Driven News Generation for Automated Journalism. In Proceedings of the 10th International Conference on Natural Language Generation, pages 188–197, Santiago de Compostela, Spain. Association for Computational Linguistics.
Cite (Informal):
Data-Driven News Generation for Automated Journalism (Leppänen et al., INLG 2017)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/W17-3528.pdf