Tetsuro Takahashi

2025

A corpus of dialogues between multimodal systems and humans is indispensable for the development and improvement of such systems. However, there is a shortage of human-machine multimodal dialogue datasets, which hinders the widespread deployment of these systems in society. To address this issue, we construct a Japanese multimodal human-machine dialogue corpus, DSLCMM, by collecting and organizing data from the Dialogue System Live Competitions (DSLCs). This paper details the procedure for constructing the corpus and presents our analysis of the relationship between various dialogue features and evaluation scores provided by users.

2018

Interpretation of Implicit Conditions in Database Search Dialogues
Shunya Fukunaga | Hitoshi Nishikawa | Takenobu Tokunaga | Hikaru Yokono | Tetsuro Takahashi
Proceedings of the 27th International Conference on Computational Linguistics

Targeting database search dialogues, we propose to utilise information in user utterances that does not directly mention a database (DB) field of the backend database system but is useful for constructing database queries. We call this kind of information implicit conditions. Interpreting implicit conditions enables the dialogue system to communicate with humans more naturally and efficiently. We formalised the interpretation of implicit conditions as classifying user utterances into the related DB field while simultaneously identifying the evidence for that classification. Introducing this new task is one of the contributions of this paper. We implemented two models for this task: an SVM-based model and an RCNN-based model. In an evaluation using a corpus of simulated dialogues between a real estate agent and a customer, we found that the SVM-based model performed better than the RCNN-based model.

Analysis of Implicit Conditions in Database Search Dialogues
Shun-ya Fukunaga | Hitoshi Nishikawa | Takenobu Tokunaga | Hikaru Yokono | Tetsuro Takahashi
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Big Community Data before World Wide Web Era
Tomoya Iwakura | Tetsuro Takahashi | Akihiro Ohtani | Kunio Matsui
Proceedings of the 12th Workshop on Asian Language Resources (ALR12)

This paper introduces the NIFTY-Serve corpus, a large data archive collected from Japanese discussion forums that operated via a Bulletin Board System (BBS) between 1987 and 2006. This corpus can be used in artificial intelligence research areas such as natural language processing and community analysis. The NIFTY-Serve corpus differs from data on the WWW in three ways: (1) it is essentially spam- and duplication-free because of strict data collection procedures, (2) it contains historic user-generated data from before the WWW, and (3) it is a complete data set because the service has now shut down. We also introduce some examples of how the corpus can be used.

2003

Text Simplification for Reading Assistance: A Project Note
Kentaro Inui | Atsushi Fujita | Tetsuro Takahashi | Ryu Iida | Tomoya Iwakura
Proceedings of the Second International Workshop on Paraphrasing
