Taro Okahisa


2024

pdf
Domain Transferable Semantic Frames for Expert Interview Dialogues
Taishi Chika | Taro Okahisa | Takashi Kodama | Yin Jou Huang | Yugo Murawaki | Sadao Kurohashi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Interviews are an effective method to elicit critical skills to perform particular processes in various domains. In order to understand the knowledge structure of these domain-specific processes, we consider semantic role and predicate annotation based on Frame Semantics. We introduce a dataset of interview dialogues with experts in the culinary and gardening domains, each annotated with semantic frames. This dataset consists of (1) 308 interview dialogues related to the culinary domain, originally assembled by Okahisa et al. (2022), and (2) 100 interview dialogues associated with the gardening domain, which we newly acquired. The labeling specifications take into account the domain-transferability by adopting domain-agnostic labels for frame elements. In addition, we conducted domain transfer experiments from the culinary domain to the gardening domain to examine the domain transferability with our dataset. The experimental results showed the effectiveness of our domain-agnostic labeling scheme.

2023

pdf
Is a Knowledge-based Response Engaging?: An Analysis on Knowledge-Grounded Dialogue with Information Source Annotation
Takashi Kodama | Hirokazu Kiyomaru | Yin Jou Huang | Taro Okahisa | Sadao Kurohashi
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)

Currently, most knowledge-grounded dialogue response generation models focus on reflecting given external knowledge. However, even when conveying external knowledge, humans integrate their own knowledge, experiences, and opinions with external knowledge to make their utterances engaging. In this study, we analyze such human behavior by annotating the utterances in an existing knowledge-grounded dialogue corpus. Each entity in the corpus is annotated with its information source, either derived from external knowledge (database-derived) or the speaker’s own knowledge, experiences, and opinions (speaker-derived). Our analysis shows that the presence of speaker-derived information in the utterance improves dialogue engagingness. We also confirm that responses generated by an existing model, which is trained to reflect the given knowledge, cannot include speaker-derived information in responses as often as humans do.

2022

pdf
Constructing a Culinary Interview Dialogue Corpus with Video Conferencing Tool
Taro Okahisa | Ribeka Tanaka | Takashi Kodama | Yin Jou Huang | Sadao Kurohashi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

Interview is an efficient way to elicit knowledge from experts of different domains. In this paper, we introduce CIDC, an interview dialogue corpus in the culinary domain in which interviewers play an active role to elicit culinary knowledge from the cooking expert. The corpus consists of 308 interview dialogues (each about 13 minutes in length), which add up to a total of 69,000 utterances. We use a video conferencing tool for data collection, which allows us to obtain the facial expressions of the interlocutors as well as the screen-sharing contents. To understand the impact of the interlocutors’ skill level, we divide the experts into “semi-professionals’” and “enthusiasts” and the interviewers into “skilled interviewers” and “unskilled interviewers.” For quantitative analysis, we report the statistics and the results of the post-interview questionnaire. We also conduct qualitative analysis on the collected interview dialogues and summarize the salient patterns of how interviewers elicit knowledge from the experts. The corpus serves the purpose to facilitate future research on the knowledge elicitation mechanism in interview dialogues.

2018

pdf
J-MeDic: A Japanese Disease Name Dictionary based on Real Clinical Usage
Kaoru Ito | Hiroyuki Nagai | Taro Okahisa | Shoko Wakamiya | Tomohide Iwao | Eiji Aramaki
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)