2008
Test Collections for Spoken Document Retrieval from Lecture Audio Data
Tomoyosi Akiba
|
Kiyoaki Aikawa
|
Yoshiaki Itoh
|
Tatsuya Kawahara
|
Hiroaki Nanjo
|
Hiromitsu Nishizaki
|
Norihito Yasuda
|
Yoichi Yamashita
|
Katunobu Itou
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
The Spoken Document Processing Working Group, which is part of the special interest group on spoken language processing of the Information Processing Society of Japan, is developing a test collection for the evaluation of spoken document retrieval systems. A prototype of the test collection consists of a set of textual queries, relevant segment lists, and transcriptions produced by an automatic speech recognition system, allowing retrieval from the Corpus of Spontaneous Japanese (CSJ). Of about 100 initial queries, 39 satisfied the criterion that a query should have more than five relevant segments, where segments are about one minute of speech. An ad hoc retrieval experiment on the test collection was also conducted to assess baseline retrieval performance using a standard method for spoken document retrieval.
In-car Speech Data Collection along with Various Multimodal Signals
Akira Ozaki
|
Sunao Hara
|
Takashi Kusakawa
|
Chiyomi Miyajima
|
Takanori Nishino
|
Norihide Kitaoka
|
Katunobu Itou
|
Kazuya Takeda
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In this paper, a large-scale real-world speech database is introduced along with other multimedia driving data. We designed a data collection vehicle equipped with various sensors to synchronously record twelve-channel speech, three-channel video, driving behavior including gas and brake pedal pressures, steering angles, and vehicle velocities, and physiological signals including driver heart rate, skin conductance, and emotion-based sweating on the palms and soles. These multimodal data are collected while driving on city streets and expressways under four different driving task conditions: two kinds of monologues, human-human dialog, and human-machine dialog. We investigated the response timing of drivers to navigator utterances and found that most responses overlapped with the preceding utterance due to the task characteristics and the features of Japanese. When comparing the utterance length, speaking rate, and filler rate of driver utterances in human-human and human-machine dialogs, we found that drivers tended to use longer and faster utterances with more fillers when talking with humans than with machines.
2006
Statistical Analysis for Thesaurus Construction using an Encyclopedic Corpus
Yasunori Ohishi
|
Katunobu Itou
|
Kazuya Takeda
|
Atsushi Fujii
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
This paper proposes a discrimination method for hierarchical relations between word pairs. The method is a statistical one using an encyclopedic corpus extracted and organized from Web pages. In the proposed method, we exploit the statistical tendency that hyponyms' descriptions tend to include hypernyms, whereas hypernyms' descriptions do not include all of the hyponyms. Experimental results show that the method detected 61.7% of the relations in an actual thesaurus.
2004
Collecting Spontaneously Spoken Queries for Information Retrieval
Tomoyosi Akiba
|
Atsushi Fujii
|
Katunobu Itou
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
2002
Producing a Large-scale Encyclopedic Corpus over the Web
Atsushi Fujii
|
Katunobu Itou
|
Tetsuya Ishikawa
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
A Method for Open-Vocabulary Speech-Driven Text Retrieval
Atsushi Fujii
|
Katunobu Itou
|
Tetsuya Ishikawa
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)