Yoshihiko Gotoh


Natural Language Descriptions for Human Activities in Video Streams
Nouf Alharbi | Yoshihiko Gotoh
Proceedings of the 10th International Conference on Natural Language Generation

There has been continuous growth in the volume and ubiquity of video material. It has become essential to define video semantics in order to aid the searchability and retrieval of this data. We present a framework that produces textual descriptions of video based on its visual semantic content. Detected action classes are rendered as verbs, participant objects as noun phrases, visual properties of detected objects as adjectives, and spatial relations between objects as prepositions. Further, in cases of zero-shot action recognition, a language model is used to infer a missing verb, aided by the detection of objects and scene settings. These extracted features are converted into textual descriptions using a template-based approach. The proposed video description framework is evaluated on the NLDHA dataset using ROUGE scores and human judgment.


Natural Language Descriptions of Human Activities Scenes: Corpus Generation and Analysis
Nouf Alharbi | Yoshihiko Gotoh
Proceedings of the 5th Workshop on Vision and Language


Natural Language Descriptions of Visual Scenes Corpus Generation and Analysis
Muhammad Usman Ghani Khan | Rao Muhammad Adeel Nawab | Yoshihiko Gotoh
Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra)

Describing Video Contents in Natural Language
Muhammad Usman Ghani Khan | Yoshihiko Gotoh
Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data


On the Subjectivity of Human Authored Summaries
BalaKrishna Kolluru | Yoshihiko Gotoh
Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization