Maitreyee Maitreyee


Beyond Adjacency Pairs: Hierarchical Clustering of Long Sequences for Human-Machine Dialogues
Maitreyee Maitreyee
Proceedings of the First Workshop on Computational Approaches to Discourse

This work proposes a framework for predicting sequences in dialogues using turn-based syntactic features and dialogue control functions. Syntactic features were extracted using dependency parsing, while dialogue control functions were manually labelled. These features were transformed using tf-idf and word embeddings; feature selection was done using Principal Component Analysis (PCA). We ran experiments on six combinations of features to predict sequences with Hierarchical Agglomerative Clustering. An analysis of the clustering results indicates that using word embeddings and syntactic features significantly improved the results.
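
As a rough illustration of the kind of pipeline the abstract describes (tf-idf transformation, PCA, then Hierarchical Agglomerative Clustering), here is a minimal sketch using scikit-learn. It is not the author's implementation. The toy turns, the `dcf_*` label names, the helper `cluster_turns`, the Ward linkage, and the choice of two components and three clusters are all assumptions made for this example; the abstract does not specify them.

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy data: each turn is written as a string of dependency
# relation labels plus one manually assigned dialogue control function
# label (the "dcf_*" tokens). The three groups use disjoint vocabularies
# on purpose so the illustration is clear-cut; real data would overlap.
turns = [
    "intj vocative dcf_greeting",
    "intj vocative vocative dcf_greeting",
    "aux nsubj obj dcf_question",
    "aux nsubj obj obj dcf_question",
    "advmod discourse dcf_feedback",
    "advmod discourse discourse dcf_feedback",
]


def cluster_turns(turns, n_components=2, n_clusters=3):
    """Hypothetical helper: tf-idf -> PCA -> agglomerative clustering."""
    # Transform the label strings into tf-idf weighted vectors.
    tfidf = TfidfVectorizer().fit_transform(turns).toarray()
    # Reduce dimensionality with PCA (assumed number of components).
    reduced = PCA(n_components=n_components).fit_transform(tfidf)
    # Cluster bottom-up; scikit-learn's default linkage is Ward,
    # which is an assumption here, not a detail from the abstract.
    model = AgglomerativeClustering(n_clusters=n_clusters)
    return model.fit_predict(reduced).tolist()


labels = cluster_turns(turns)
print(labels)
```

Each turn receives a cluster id; the numbering of the ids is arbitrary, so only co-membership is meaningful. Swapping the tf-idf step for averaged word embeddings would give another of the feature combinations the abstract mentions, but that variant is not shown here.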