Abstract
Social media has become an important information source for crisis management, providing quick access to ongoing developments and critical information. However, classification models suffer from event-related biases and highly imbalanced label distributions, which still pose a challenge. To address these challenges, we propose a combination of entity-masked language modeling and hierarchical multi-label classification as a multi-task learning problem. We evaluate our method on tweets from the TREC-IS dataset and show an absolute performance gain w.r.t. F1-score of up to 10% for actionable information types. Moreover, we found that entity-masking reduces the effect of overfitting to in-domain events and enables improvements in cross-event generalization.
- Anthology ID:
- 2022.nlp4pi-1.9
- Volume:
- Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI)
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates (Hybrid)
- Editors:
- Laura Biester, Dorottya Demszky, Zhijing Jin, Mrinmaya Sachan, Joel Tetreault, Steven Wilson, Lu Xiao, Jieyu Zhao
- Venue:
- NLP4PI
- Publisher:
- Association for Computational Linguistics
- Pages:
- 70–78
- URL:
- https://aclanthology.org/2022.nlp4pi-1.9
- DOI:
- 10.18653/v1/2022.nlp4pi-1.9
- Cite (ACL):
- Philipp Seeberger and Korbinian Riedhammer. 2022. Enhancing Crisis-Related Tweet Classification with Entity-Masked Language Modeling and Multi-Task Learning. In Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI), pages 70–78, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
- Cite (Informal):
- Enhancing Crisis-Related Tweet Classification with Entity-Masked Language Modeling and Multi-Task Learning (Seeberger & Riedhammer, NLP4PI 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2022.nlp4pi-1.9.pdf