Streaming Text Analytics for Real-Time Event Recognition
Philippe Thomas, Johannes Kirschnick, Leonhard Hennig, Renlong Ai, Sven Schmeier, Holmer Hemsen, Feiyu Xu, Hans Uszkoreit
Abstract
A huge body of continuously growing written knowledge is available on the web in the form of social media posts, RSS feeds, and news articles. Real-time information extraction from such high velocity, high volume text streams requires scalable, distributed natural language processing pipelines. We introduce such a system for fine-grained event recognition within the big data framework Flink, and demonstrate its capabilities for extracting and geo-locating mobility- and industry-related events from heterogeneous text sources. Performance analyses conducted on several large datasets show that our system achieves high throughput and maintains low latency, which is crucial when events need to be detected and acted upon in real-time. We also present promising experimental results for the event extraction component of our system, which recognizes a novel set of event types. The demo system is available at http://dfki.de/sd4m-sta-demo/.
- Anthology ID:
- R17-1096
- Volume:
- Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
- Month:
- September
- Year:
- 2017
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 750–757
- URL:
- https://doi.org/10.26615/978-954-452-049-6_096
- DOI:
- 10.26615/978-954-452-049-6_096
- Cite (ACL):
- Philippe Thomas, Johannes Kirschnick, Leonhard Hennig, Renlong Ai, Sven Schmeier, Holmer Hemsen, Feiyu Xu, and Hans Uszkoreit. 2017. Streaming Text Analytics for Real-Time Event Recognition. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 750–757, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Streaming Text Analytics for Real-Time Event Recognition (Thomas et al., RANLP 2017)
- PDF:
- https://doi.org/10.26615/978-954-452-049-6_096