Michael Kipp

2014

pdf abs
Single-Person and Multi-Party 3D Visualizations for Nonverbal Communication Analysis
Michael Kipp | Levin Freiherr von Hollen | Michael Christopher Hrstka | Franziska Zamponi
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The qualitative analysis of nonverbal communication is more and more relying on 3D recording technology. However, the human analysis of 3D data on a regular 2D screen can be challenging as 3D scenes are difficult to visually parse. To optimally exploit the full depth of the 3D data, we propose to enhance the 3D view with a number of visualizations that clarify spatial and conceptual relationships and add derived data like speed and angles. In this paper, we present visualizations for directional body motion, hand movement direction, gesture space location, and proxemic dimensions like interpersonal distance, movement and orientation. The proposed visualizations are available in the open source tool JMocap and are planned to be fully integrated into the ANVIL video annotation tool. The described techniques are intended to make annotation more efficient and reliable and may allow the discovery of entirely new phenomena.

2012

pdf abs
Annotation Facilities for the Reliable Analysis of Human Motion
Michael Kipp
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Human motion is challenging to analyze due to the many degrees of freedom of the human body. While the qualitative analysis of human motion lies at the core of many research fields, including multimodal communication, it is still hard to achieve reliable results when human coders transcribe motion with abstract categories. In this paper we tackle this problem in two respects. First, we provide facilities for qualitative and quantitative comparison of annotations. Second, we provide facilities for exploring highly precise recordings of human motion (motion capture) using a low-cost consumer device (Kinect). We present visualization and analysis methods, integrated in the existing ANVIL video annotation tool (Kipp 2001), and provide both a precision analysis and a """"cookbook"""" for Kinect-based motion analysis.

pdf abs
Using DiAML and ANVIL for multimodal dialogue annotations
Harry Bunt | Michael Kipp | Volha Petukhova
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

This paper shows how interoperable dialogue act annotations, using the multidimensional annotation scheme and the markup language DiAML of ISO standard 24617-2, can conveniently be obtained using the newly implemented facility in the ANVIL annotation tool to produce XML-based output directly in the DiAML format. ANVIL offers the use of multiple user-defined `tiers' for annotating various kinds of information. This is shown to be convenient not only for multimodal information but also for dialogue act annotation according to ISO standard 24617-2 because of the latter's multidimensionality: functional dialogue segments are viewed as expressing one or more dialogue acts, and every dialogue act belongs to one of a number of dimensions of communication, defined in the standard, for each of which a different ANVIL tier can conveniently be used. Annotations made in the multi-tier interface can be exported in the ISO 24617-2 format, thus supporting the creation of interoperable annotated corpora of multimodal dialogue.

2010

pdf abs
Annotation of Human Gesture using 3D Skeleton Controls
Quan Nguyen | Michael Kipp
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

The manual transcription of human gesture behavior from video for linguistic analysis is a work-intensive process that results in a rather coarse description of the original motion. We present a novel approach for transcribing gestural movements: by overlaying an articulated 3D skeleton onto the video frame(s) the human coder can replicate original motions on a pose-by-pose basis by manipulating the skeleton. Our tool is integrated in the ANVIL tool so that both symbolic interval data and 3D pose data can be entered in a single tool. Our method allows a relatively quick annotation of human poses which has been validated in a user study. The resulting data are precise enough to create animations that match the original speaker's motion which can be validated with a realtime viewer. The tool can be applied for a variety of research topics in the areas of conversational analysis, gesture studies and intelligent virtual agents.

2008

This paper presents the results of a joint effort of a group of multimodality researchers and tool developers to improve the interoperability between several tools used for the annotation of multimodality. We propose a multimodal annotation exchange format, based on the annotation graph formalism, which is supported by import and export routines in the respective tools.

pdf abs
Spatiotemporal Coding in ANVIL
Michael Kipp
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present a new coding mechanism, spatiotemporal coding, that allows coders to annotate points and regions in the video frame by drawing directly on the screen. Coders can not only attach labels to time intervals in the video but can specify a possibly moving region on the video screen. This opens up the spatial dimension for multi-track video coding and is an essential asset in almost every area of video coding, e.g. gesture coding, facial expression coding, encoding semantics for information retrieval etc. We discuss conceptual variants, design decisions and the relation to the MPEG-7 standard and tools.

Michael Kipp

2014

2012

2010

2008

2002

2000

Co-authors

Venues