Abstract
We present an approach in which an SVM classifier learns to classify head movements from measurements of velocity, acceleration, and jerk (the third derivative of position with respect to time). The trained classifier is then used to annotate head movements in new video data. The automatic annotations are evaluated against manual annotations of the same data and reach an accuracy of 68%; the results also show that including jerk improves accuracy. We then investigate the overlap between temporal sequences classified as movement or non-movement and the speech stream of the person performing the gesture. The statistics derived from this analysis suggest that adding word features may further increase the accuracy of the model.
- Anthology ID:
- W17-2006
- Volume:
- Proceedings of the Sixth Workshop on Vision and Language
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Anya Belz, Erkut Erdem, Katerina Pastra, Krystian Mikolajczyk
- Venue:
- VL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 40–42
- URL:
- https://aclanthology.org/W17-2006
- DOI:
- 10.18653/v1/W17-2006
- Cite (ACL):
- Patrizia Paggio, Costanza Navarretta, and Bart Jongejan. 2017. Automatic identification of head movements in video-recorded conversations: can words help?. In Proceedings of the Sixth Workshop on Vision and Language, pages 40–42, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Automatic identification of head movements in video-recorded conversations: can words help? (Paggio et al., VL 2017)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/W17-2006.pdf
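The kinematic features named in the abstract (velocity, acceleration, jerk) can be derived from a sampled head-position trace by successive finite differences. The sketch below is illustrative only: the function names and the assumed 25 fps sampling rate are not taken from the paper, which does not publish its feature-extraction code.

```python
# Hedged sketch: velocity, acceleration, and jerk from a 1-D head-position
# trace via successive first-order finite differences. The sampling rate
# and function names are illustrative assumptions, not the authors' code.

def finite_diff(samples, dt):
    """First-order finite difference of a 1-D sequence of samples."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]

def kinematic_features(positions, dt=1.0 / 25):  # 25 fps video assumed
    velocity = finite_diff(positions, dt)
    acceleration = finite_diff(velocity, dt)
    jerk = finite_diff(acceleration, dt)  # third derivative of position
    return velocity, acceleration, jerk

# Sanity check: a quadratic trajectory x(t) = 0.5 * t^2 has constant
# acceleration and zero jerk.
pos = [0.5 * t * t for t in range(6)]
v, a, j = kinematic_features(pos, dt=1.0)
```

These per-frame feature vectors could then be fed to any off-the-shelf SVM implementation for the movement / non-movement classification described in the abstract.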