Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks

Shikib Mehri, Giuseppe Carenini


Abstract
Thread disentanglement is a precursor to any high-level analysis of multiparticipant chats. Existing research approaches the problem by calculating the likelihood that two messages belong to the same thread. Our approach leverages a newly annotated dataset to identify reply relationships. Furthermore, we explore the use of an RNN, along with large quantities of unlabeled data, to learn semantic relationships between messages. Our proposed pipeline, which uses a reply classifier and an RNN to generate a set of disentangled threads, is novel and performs well against previous work.
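The abstract describes turning predicted reply relationships into a set of disentangled threads. The snippet below is only a minimal sketch of one way that final grouping step could work, assuming the reply classifier has already produced (message, replied-to message) index pairs. The function name `threads_from_replies`, the input format, and the use of union-find to take connected components are illustrative assumptions, not the authors' implementation, which the abstract does not specify.

```python
from collections import defaultdict


def threads_from_replies(num_messages, reply_links):
    """Group message indices into threads (hypothetical helper).

    reply_links is an assumed input format: (message, replied_to) index
    pairs, e.g. the positive predictions of some reply classifier.
    Messages connected by any chain of reply links share a thread.
    """
    parent = list(range(num_messages))

    def find(i):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for message, replied_to in reply_links:
        root_a, root_b = find(message), find(replied_to)
        if root_a != root_b:
            # Keep the earliest message index as the thread root.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = defaultdict(list)
    for i in range(num_messages):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Toy example: five messages, three predicted reply links.
links = [(1, 0), (3, 2), (4, 1)]
threads = threads_from_replies(5, links)
print(threads)
```

In this toy input, messages 0, 1, and 4 end up in one thread and messages 2 and 3 in another; a message with no predicted reply link forms a thread of its own.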
Anthology ID:
I17-1062
Volume:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Venue:
IJCNLP
Publisher:
Asian Federation of Natural Language Processing
Pages:
615–623
URL:
https://aclanthology.org/I17-1062
Cite (ACL):
Shikib Mehri and Giuseppe Carenini. 2017. Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 615–623, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal):
Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks (Mehri & Carenini, IJCNLP 2017)
PDF:
https://preview.aclanthology.org/ingestion-script-update/I17-1062.pdf
Dataset:
 I17-1062.Datasets.tgz