Evgeny Stepanov

Also published as: Evgeny A. Stepanov

2019

pdf abs
Affective Behaviour Analysis of On-line User Interactions: Are On-line Support Groups More Therapeutic than Twitter?
Giuliano Tortoreto | Evgeny Stepanov | Alessandra Cervone | Mateusz Dubiel | Giuseppe Riccardi
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task

The increase in the prevalence of mental health problems has coincided with a growing popularity of health related social networking sites. Regardless of their therapeutic potential, on-line support groups (OSGs) can also have negative effects on patients. In this work we propose a novel methodology to automatically verify the presence of therapeutic factors in social networking websites by using Natural Language Processing (NLP) techniques. The methodology is evaluated on on-line asynchronous multi-party conversations collected from an OSG and Twitter. The results of the analysis indicate that therapeutic factors occur more frequently in OSG conversations than in Twitter conversations. Moreover, the analysis of OSG conversations reveals that the users of that platform are supportive, and interactions are likely to lead to the improvement of their emotional state. We believe that our method provides a stepping stone towards automatic analysis of emotional states of users of online platforms. Possible applications of the method include provision of guidelines that highlight potential implications of using such platforms on users’ mental health, and/or support in the analysis of their impact on specific individuals.

2018

pdf abs
ISO-Standard Domain-Independent Dialogue Act Tagging for Conversational Agents
Stefano Mezza | Alessandra Cervone | Evgeny Stepanov | Giuliano Tortoreto | Giuseppe Riccardi
Proceedings of the 27th International Conference on Computational Linguistics

Dialogue Act (DA) tagging is crucial for spoken language understanding systems, as it provides a general representation of speakers’ intents, not bound to a particular dialogue system. Unfortunately, publicly available data sets with DA annotation are all based on different annotation schemes and thus incompatible with each other. Moreover, their schemes often do not cover all aspects necessary for open-domain human-machine interaction. In this paper, we propose a methodology to map several publicly available corpora to a subset of the ISO standard, in order to create a large task-independent training corpus for DA classification. We show the feasibility of using this corpus to train a domain-independent DA tagger testing it on out-of-domain conversational data, and argue the importance of training on multiple corpora to achieve robustness across different DA categories.

2017

pdf abs
Automatic Community Creation for Abstractive Spoken Conversations Summarization
Karan Singla | Evgeny Stepanov | Ali Orkan Bayer | Giuseppe Carenini | Giuseppe Riccardi
Proceedings of the Workshop on New Frontiers in Summarization

Summarization of spoken conversations is a challenging task, since it requires deep understanding of dialogs. Abstractive summarization techniques rely on linking the summary sentences to sets of original conversation sentences, i.e. communities. Unfortunately, such linking information is rarely available or requires trained annotators. We propose and experiment automatic community creation using cosine similarity on different levels of representation: raw text, WordNet SynSet IDs, and word embeddings. We show that the abstractive summarization systems with automatic communities significantly outperform previously published results on both English and Italian corpora.

pdf bib abs
Functions of Silences towards Information Flow in Spoken Conversation
Shammur Absar Chowdhury | Evgeny Stepanov | Morena Danieli | Giuseppe Riccardi
Proceedings of the Workshop on Speech-Centric Natural Language Processing

Silence is an integral part of the most frequent turn-taking phenomena in spoken conversations. Silence is sized and placed within the conversation flow and it is coordinated by the speakers along with the other speech acts. The objective of this analytical study is twofold: to explore the functions of silence with duration of one second and above, towards information flow in a dyadic conversation utilizing the sequences of dialog acts present in the turns surrounding the silence itself; and to design a feature space useful for clustering the silences using a hierarchical concept formation algorithm. The resulting clusters are manually grouped into functional categories based on their similarities. It is observed that the silence plays an important role in response preparation, also can indicate speakers’ hesitation or indecisiveness. It is also observed that sometimes long silences can be used deliberately to get a forced response from another speaker thus making silence a multi-functional and an important catalyst towards information flow.

2016

pdf abs
Predicting Brexit: Classifying Agreement is Better than Sentiment and Pollsters
Fabio Celli | Evgeny Stepanov | Massimo Poesio | Giuseppe Riccardi
Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES)

On June 23rd 2016, UK held the referendum which ratified the exit from the EU. While most of the traditional pollsters failed to forecast the final vote, there were online systems that hit the result with high accuracy using opinion mining techniques and big data. Starting one month before, we collected and monitored millions of posts about the referendum from social media conversations, and exploited Natural Language Processing techniques to predict the referendum outcome. In this paper we discuss the methods used by traditional pollsters and compare it to the predictions based on different opinion mining techniques. We find that opinion mining based on agreement/disagreement classification works better than opinion mining based on polarity classification in the forecast of the referendum outcome.

pdf abs
The Social Mood of News: Self-reported Annotations to Design Automatic Mood Detection Systems
Firoj Alam | Fabio Celli | Evgeny A. Stepanov | Arindam Ghosh | Giuseppe Riccardi
Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES)

In this paper, we address the issue of automatic prediction of readers’ mood from newspaper articles and comments. As online newspapers are becoming more and more similar to social media platforms, users can provide affective feedback, such as mood and emotion. We have exploited the self-reported annotation of mood categories obtained from the metadata of the Italian online newspaper corriere.it to design and evaluate a system for predicting five different mood categories from news articles and comments: indignation, disappointment, worry, satisfaction, and amusement. The outcome of our experiments shows that overall, bag-of-word-ngrams perform better compared to all other feature sets; however, stylometric features perform better for the mood score prediction of articles. Our study shows that self-reported annotations can be used to design automatic mood prediction systems.

pdf
Do We Really Need All Those Rich Linguistic Features? A Neural Network-Based Approach to Implicit Sense Labeling
Niko Schenk | Christian Chiarcos | Kathrin Donandt | Samuel Rönnqvist | Evgeny Stepanov | Giuseppe Riccardi
Proceedings of the CoNLL-16 shared task

pdf
UniTN End-to-End Discourse Parser for CoNLL 2016 Shared Task
Evgeny Stepanov | Giuseppe Riccardi
Proceedings of the CoNLL-16 shared task

pdf abs
Transfer of Corpus-Specific Dialogue Act Annotation to ISO Standard: Is it worth it?
Shammur Absar Chowdhury | Evgeny Stepanov | Giuseppe Riccardi
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Spoken conversation corpora often adapt existing Dialogue Act (DA) annotation specifications, such as DAMSL, DIT++, etc., to task specific needs, yielding incompatible annotations; thus, limiting corpora re-usability. Recently accepted ISO standard for DA annotation – Dialogue Act Markup Language (DiAML) – is designed as domain and application independent. Moreover, the clear separation of dialogue dimensions and communicative functions, coupled with the hierarchical organization of the latter, allows for classification at different levels of granularity. However, re-annotating existing corpora with the new scheme might require significant effort. In this paper we test the utility of the ISO standard through comparative evaluation of the corpus-specific legacy and the semi-automatically transferred DiAML DA annotations on supervised dialogue act classification task. To test the domain independence of the resulting annotations, we perform cross-domain and data aggregation evaluation. Compared to the legacy annotation scheme, on the Italian LUNA Human-Human corpus, the DiAML annotation scheme exhibits better cross-domain and data aggregation classification performance, while maintaining comparable in-domain performance.

pdf abs
Summarizing Behaviours: An Experiment on the Annotation of Call-Centre Conversations
Morena Danieli | Balamurali A R | Evgeny Stepanov | Benoit Favre | Frederic Bechet | Giuseppe Riccardi
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Annotating and predicting behavioural aspects in conversations is becoming critical in the conversational analytics industry. In this paper we look into inter-annotator agreement of agent behaviour dimensions on two call center corpora. We find that the task can be annotated consistently over time, but that subjectivity issues impacts the quality of the annotation. The reformulation of some of the annotated dimensions is suggested in order to improve agreement.

2015

pdf
The UniTN Discourse Parser in CoNLL 2015 Shared Task: Token-level Sequence Labeling with Argument-specific Models
Evgeny Stepanov | Giuseppe Riccardi | Ali Orkan Bayer
Proceedings of the Nineteenth Conference on Computational Natural Language Learning - Shared Task

pdf
Call Centre Conversation Summarization: A Pilot Task at Multiling 2015
Benoit Favre | Evgeny Stepanov | Jérémy Trione | Frédéric Béchet | Giuseppe Riccardi
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

pdf
Towards Cross-Domain PDTB-Style Discourse Parsing
Evgeny Stepanov | Giuseppe Riccardi
Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi)

pdf abs
The Development of the Multilingual LUNA Corpus for Spoken Language System Porting
Evgeny Stepanov | Giuseppe Riccardi | Ali Orkan Bayer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The development of annotated corpora is a critical process in the development of speech applications for multiple target languages. While the technology to develop a monolingual speech application has reached satisfactory results (in terms of performance and effort), porting an existing application from a source language to a target language is still a very expensive task. In this paper we address the problem of creating multilingual aligned corpora and its evaluation in the context of a spoken language understanding (SLU) porting task. We discuss the challenges of the manual creation of multilingual corpora, as well as present the algorithms for the creation of multilingual SLU via Statistical Machine Translation (SMT).