Andrei Malchanau


2018

The Metalogue Debate Trainee Corpus: Data Collection and Annotations
Volha Petukhova | Andrei Malchanau | Youssef Oualil | Dietrich Klakow | Saturnino Luz | Fasih Haider | Nick Campbell | Dimitris Koryzis | Dimitris Spiliotopoulos | Pierre Albert | Nicklas Linz | Jan Alexandersson
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

Towards Continuous Dialogue Corpus Creation: writing to corpus and generating from it
Andrei Malchanau | Volha Petukhova | Harry Bunt
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Modelling Multi-issue Bargaining Dialogues: Data Collection, Annotation Design and Corpus
Volha Petukhova | Christopher Stevens | Harmen de Weerd | Niels Taatgen | Fokie Cnossen | Andrei Malchanau
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The paper describes experimental dialogue data collection activities, as well as the creation of a semantically annotated corpus, undertaken within the EU-funded METALOGUE project (www.metalogue.eu). The project aims to develop a dialogue system with flexible dialogue management to enable the system’s adaptive, reactive, interactive and proactive dialogue behavior in setting goals, choosing appropriate strategies and monitoring numerous parallel interpretation and management processes. To achieve these goals, a negotiation (or, more precisely, multi-issue bargaining) scenario has been considered as the specific setting and application domain. The dialogue corpus forms the basis for the design of task and interaction models of participants’ negotiation behavior, and subsequently for the development of a dialogue system capable of replacing one of the negotiators. The METALOGUE corpus will be released to the community for research purposes.

The DialogBank
Harry Bunt | Volha Petukhova | Andrei Malchanau | Kars Wijnhoven | Alex Fang
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper presents the DialogBank, a new language resource consisting of dialogues with gold standard annotations according to the ISO 24617-2 standard. Some of these dialogues have been taken from existing corpora and have been re-annotated according to the ISO standard; others have been annotated directly according to the standard. The ISO 24617-2 annotations have been designed according to the ISO principles for semantic annotation, as formulated in ISO 24617-6. The DialogBank makes use of three alternative representation formats, which are shown to be interoperable.

2014

Interoperability of Dialogue Corpora through ISO 24617-2-based Querying
Volha Petukhova | Andrei Malchanau | Harry Bunt
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper explores a way of achieving interoperability: developing a query format for accessing existing annotated corpora whose expressions make use of the annotation language defined by the standard. The interpretation of expressions in the query implements a mapping from ISO 24617-2 concepts to those of the annotation scheme used in the corpus. We discuss two possible ways to query existing annotated corpora using DiAML. One way is to transform corpora into DiAML compliant format, and subsequently query these data using XQuery or XPath. The second approach is to define a DiAML query that can be directly used to retrieve requested information from the annotated data. Both approaches are valid. The first one presents a standard way of querying XML data. The second approach is a DiAML-oriented querying of dialogue act annotated data, for which we designed an interface. The proposed approach is tested on two important types of existing dialogue corpora: spoken two-person dialogue corpora collected and annotated within the HCRC Map Task paradigm, and multiparty face-to-face dialogues of the AMI corpus. We present the results and evaluate them with respect to accuracy and completeness through statistical comparisons between retrieved and manually constructed reference annotations.
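The first route described in this abstract, converting corpora into a DiAML-compliant XML format and querying them with XPath, can be sketched as follows. This is an illustrative fragment only: the element and attribute names are simplified renderings of ISO 24617-2 concepts (dialogue acts with sender, addressee, dimension and communicative function), not the exact schema used in the paper.

```python
# Sketch of XPath querying over a DiAML-style XML fragment.
# Element/attribute names are simplified illustrations of ISO 24617-2
# concepts, not the exact corpus schema.
import xml.etree.ElementTree as ET

diaml = """
<diaml>
  <dialogueAct xml:id="da1" sender="#p1" addressee="#p2"
               dimension="task" communicativeFunction="inform"/>
  <dialogueAct xml:id="da2" sender="#p2" addressee="#p1"
               dimension="autoFeedback" communicativeFunction="autoPositive"/>
  <dialogueAct xml:id="da3" sender="#p1" addressee="#p2"
               dimension="task" communicativeFunction="propositionalQuestion"/>
</diaml>
"""

root = ET.fromstring(diaml)

# Retrieve all dialogue acts in the task dimension, using the XPath
# subset (attribute-value predicates) supported by ElementTree.
task_acts = root.findall(".//dialogueAct[@dimension='task']")
functions = [da.get("communicativeFunction") for da in task_acts]
print(functions)  # ['inform', 'propositionalQuestion']
```

A full XQuery engine (e.g. over an XML database) would support richer joins between dialogue acts and their functional segments, but attribute-level retrieval of this kind already covers simple communicative-function queries.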