The SUMMA Platform: A Scalable Infrastructure for Multi-lingual Multi-media Monitoring

Ulrich Germann, Renārs Liepins, Guntis Barzdins, Didzis Gosko, Sebastião Miranda, David Nogueira


Abstract
The open-source SUMMA Platform is a highly scalable distributed architecture for monitoring a large number of media broadcasts in parallel, with a lag behind actual broadcast time of at most a few minutes. The Platform offers a fully automated media ingestion pipeline capable of recording live broadcasts, detecting and transcribing spoken content, translating all text (original or transcribed) into English, recognizing and linking Named Entities, detecting topics, clustering related media items and producing cross-lingual multi-document summaries of them, and, last but not least, extracting and storing the factual claims made in these news items. Browser-based graphical user interfaces provide users with aggregated information as well as structured access to individual news items stored in the Platform’s database. This paper describes the intended use cases and provides an overview of the system’s implementation.
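The processing stages listed in the abstract can be pictured as a chain of independent steps applied to each ingested media item. The following is only a minimal illustrative sketch under that assumption, not the Platform's actual code: `MediaItem`, `run_pipeline`, and every stage function are hypothetical names, and each stage body is a placeholder standing in for a real component (speech recognition, machine translation, entity linking, topic detection).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MediaItem:
    """Hypothetical container for one ingested news item."""
    source: str                 # e.g. a stream or feed identifier (assumed)
    language: str               # language code of the original content
    text: str = ""              # original text, empty for audio/video input
    annotations: Dict[str, object] = field(default_factory=dict)


# A stage takes an item, enriches it, and returns it.
Stage = Callable[[MediaItem], MediaItem]


def transcribe(item: MediaItem) -> MediaItem:
    # Placeholder: a real system would run speech recognition here.
    if not item.text:
        item.text = f"<transcript of {item.source}>"
    item.annotations["transcribed"] = True
    return item


def translate_to_english(item: MediaItem) -> MediaItem:
    # Placeholder: English input passes through, other languages are "translated".
    if item.language == "en":
        item.annotations["english"] = item.text
    else:
        item.annotations["english"] = f"<English translation of: {item.text}>"
    return item


def tag_entities(item: MediaItem) -> MediaItem:
    # Placeholder: a real system would recognize and link named entities.
    item.annotations["entities"] = []
    return item


def detect_topics(item: MediaItem) -> MediaItem:
    # Placeholder: a real system would assign topic labels.
    item.annotations["topics"] = []
    return item


# Assumed stage order, following the order in which the abstract lists them.
PIPELINE: List[Stage] = [transcribe, translate_to_english, tag_entities, detect_topics]


def run_pipeline(item: MediaItem, stages: List[Stage] = PIPELINE) -> MediaItem:
    """Apply each stage in turn and return the enriched item."""
    for stage in stages:
        item = stage(item)
    return item


if __name__ == "__main__":
    result = run_pipeline(MediaItem(source="stream-1", language="lv"))
    print(sorted(result.annotations))
```

Keeping each stage as a plain function over a shared item type is one simple way to make stages replaceable and independently scalable; how the Platform itself distributes this work is described in the paper, not here.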
Anthology ID:
P18-4017
Volume:
Proceedings of ACL 2018, System Demonstrations
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
99–104
URL:
https://aclanthology.org/P18-4017
DOI:
10.18653/v1/P18-4017
Cite (ACL):
Ulrich Germann, Renārs Liepins, Guntis Barzdins, Didzis Gosko, Sebastião Miranda, and David Nogueira. 2018. The SUMMA Platform: A Scalable Infrastructure for Multi-lingual Multi-media Monitoring. In Proceedings of ACL 2018, System Demonstrations, pages 99–104, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
The SUMMA Platform: A Scalable Infrastructure for Multi-lingual Multi-media Monitoring (Germann et al., ACL 2018)
PDF:
https://preview.aclanthology.org/ingestion-script-update/P18-4017.pdf