Gerhard Wohlgenannt


2016

pdf
Extracting Social Networks from Literary Text with Word Embedding Tools
Gerhard Wohlgenannt | Ekaterina Chernyak | Dmitry Ilvovsky
Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH)

In this paper a social network is extracted from a literary text. The social network shows, how frequent the characters interact and how similar their social behavior is. Two types of similarity measures are used: the first applies co-occurrence statistics, while the second exploits cosine similarity on different types of word embedding vectors. The results are evaluated by a paid micro-task crowdsourcing survey. The experiments suggest that specific types of word embeddings like word2vec are well-suited for the task at hand and the specific circumstances of literary fiction text.