Abstract
Thread disentanglement is a precursor to any high-level analysis of multiparticipant chats. Existing research approaches the problem by calculating the likelihood of two messages belonging in the same thread. Our approach leverages a newly annotated dataset to identify reply relationships. Furthermore, we explore the usage of an RNN, along with large quantities of unlabeled data, to learn semantic relationships between messages. Our proposed pipeline, which utilizes a reply classifier and an RNN to generate a set of disentangled threads, is novel and performs well against previous work.
- Anthology ID:
- I17-1062
- Volume:
- Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month:
- November
- Year:
- 2017
- Address:
- Taipei, Taiwan
- Editors:
- Greg Kondrak, Taro Watanabe
- Venue:
- IJCNLP
- Publisher:
- Asian Federation of Natural Language Processing
- Pages:
- 615–623
- URL:
- https://aclanthology.org/I17-1062
- Cite (ACL):
- Shikib Mehri and Giuseppe Carenini. 2017. Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 615–623, Taipei, Taiwan. Asian Federation of Natural Language Processing.
- Cite (Informal):
- Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks (Mehri & Carenini, IJCNLP 2017)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/I17-1062.pdf