Streaming Text Analytics for Real-Time Event Recognition

Philippe Thomas, Johannes Kirschnick, Leonhard Hennig, Renlong Ai, Sven Schmeier, Holmer Hemsen, Feiyu Xu, Hans Uszkoreit


Abstract
A huge body of continuously growing written knowledge is available on the web in the form of social media posts, RSS feeds, and news articles. Real-time information extraction from such high-velocity, high-volume text streams requires scalable, distributed natural language processing pipelines. We introduce such a system for fine-grained event recognition within the big data framework Flink, and demonstrate its capabilities for extracting and geo-locating mobility- and industry-related events from heterogeneous text sources. Performance analyses conducted on several large datasets show that our system achieves high throughput and maintains low latency, which is crucial when events need to be detected and acted upon in real time. We also present promising experimental results for the event extraction component of our system, which recognizes a novel set of event types. The demo system is available at http://dfki.de/sd4m-sta-demo/.
Anthology ID:
R17-1096
Volume:
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
Month:
September
Year:
2017
Address:
Varna, Bulgaria
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
750–757
URL:
https://doi.org/10.26615/978-954-452-049-6_096
DOI:
10.26615/978-954-452-049-6_096
Cite (ACL):
Philippe Thomas, Johannes Kirschnick, Leonhard Hennig, Renlong Ai, Sven Schmeier, Holmer Hemsen, Feiyu Xu, and Hans Uszkoreit. 2017. Streaming Text Analytics for Real-Time Event Recognition. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 750–757, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Streaming Text Analytics for Real-Time Event Recognition (Thomas et al., RANLP 2017)
PDF:
https://doi.org/10.26615/978-954-452-049-6_096