Hitoshi Iida


2008

This paper describes a method for automatically labeling the emotional degree of a speaker's anger utterances in natural Japanese dialog. First, we explain how anger utterances that appear naturally in Japanese dialog were recorded. Manual emotional-degree labeling was conducted in advance, grading the utterances on a 6-point Likert scale to obtain reference anger degrees. Experiments on automatic anger-degree estimation were then conducted to label each utterance with an anger degree based on its acoustic features. Estimation experiments were also conducted with speaker-dependent datasets to find any influence of individual emotional expression on automatic emotional-degree labeling. As a result, almost all speaker-dependent models show a higher adjusted R-squared, making them superior to the speaker-independent model in estimation capability. However, the residual between the automatic and manual emotional degrees for the speaker-independent model (0.73) is comparable to those of the speaker-dependent models, so there remains the possibility of labeling utterances with the speaker-independent model.
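The comparison above between speaker-dependent and speaker-independent models rests on adjusted R-squared for a regression of anger degree on acoustic features. As a minimal sketch (the linear model form, feature dimensionality, and data here are assumptions for illustration, not the paper's actual setup), the metric could be computed as:

```python
import numpy as np

def fit_linear_model(X, y):
    """Least-squares fit of anger degree y on an acoustic-feature matrix X."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict(X, coef):
    """Predicted anger degrees for feature matrix X."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef

def adjusted_r_squared(y, y_pred, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1),
    penalizing R^2 for the number of predictors p."""
    ss_res = np.sum((y - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)        # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    n = len(y)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)
```

A speaker-dependent model would simply be fit on one speaker's utterances only; the adjustment term matters here because per-speaker datasets are small relative to the feature count.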

1999

Speech communication raises many important issues in natural language processing, all of which bear on the design of advanced speech translation systems. Such systems need to handle the interactive nature of speech communication, pragmatics in speech, and the arbitrariness of speech usage. General characteristics of speech communication are discussed, along with various viewpoints on interaction, pragmatics, and arbitrary usage. Some present approaches to speech translation are explained and new basic technologies are introduced. Finally, a synthetic NLP technology, analogous to a composite art form, is proposed for speech communication and speech translation.

1998

1997

This paper describes a Transfer-Driven Machine Translation (TDMT) system as a prototype for efficient multilingual spoken-dialog translation. Currently, the TDMT system deals with dialogues in the travel domain, such as travel scheduling, hotel reservations, and trouble-shooting, and covers almost all expressions presented in commercially available travel conversation guides. In addition, to put a speech dialog translation system to practical use, it is necessary to develop a mechanism that can handle speech recognition errors. In TDMT, robust translation is achieved by an example-based correct parts extraction (CPE) technique, which translates the plausible parts of speech recognition results even when those results contain several recognition errors. We have applied TDMT to three language pairs: Japanese-English, Japanese-Korean, and Japanese-German. Simulations of dialog communication between speakers of different languages can be run over a TCP/IP network. In a performance evaluation of TDMT translation using 69-87 unseen dialogs, we achieved about 70% acceptability for Japanese-to-English and Korean-to-Japanese translation, almost 60% for English-to-Japanese and Japanese-to-German translation, and about 90% for Japanese-to-Korean translation. When handling erroneous sentences caused by speech recognition errors, although almost all translation results are unacceptable with conventional methods, 69% of the speech translation results are improved by the CPE technique.

1996

1995

1994

1993

1992

1991

1990

1988

1984