Seiichi Nakagawa

Also published as: S. Nakagawa


Developing Partially-Transcribed Speech Corpus from Edited Transcriptions
Kengo Ohta | Masatoshi Tsuchiya | Seiichi Nakagawa
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Large-scale spontaneous speech corpora are a crucial resource for various domains of spoken language processing. However, the available corpora are usually limited because they are expensive to construct, especially when speech must be transcribed precisely. On the other hand, loosely transcribed corpora such as shorthand notes, meeting records, and closed captions are more widely available than precisely transcribed ones, because their imperfection reduces their construction cost. Because these corpora contain both precisely transcribed regions and edited regions, it is difficult to use them directly as speech corpora for training acoustic models. Against this background, we have been considering building an efficient semi-automatic framework to convert loose transcriptions into precise ones. This paper describes an improved method for automatically detecting precise regions in loosely transcribed corpora, for use in the above framework. Our detection method consists of two steps: the first step is a forced alignment between loose transcriptions and their utterances to find the utterance corresponding to a given loose transcription, and the second step is a detector of precise regions based on a support vector machine that uses several features obtained from the first step. Our experimental results show that our method detects precise regions with high accuracy, and that the precise regions extracted by our method are effective as training labels for lightly supervised speaker adaptation.


Analysis and Robust Extraction of Changing Named Entities
Masatoshi Tsuchiya | Shoko Endo | Seiichi Nakagawa
Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)


Developing Corpus of Japanese Classroom Lecture Speech Contents
Masatoshi Tsuchiya | Satoru Kogure | Hiromitsu Nishizaki | Kengo Ohta | Seiichi Nakagawa
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper describes the Corpus of Japanese classroom Lecture speech Contents (henceforth CJLC), which we are developing. The growing volume of e-Learning content demands a sophisticated interactive browsing system; however, existing tools do not satisfy this requirement. Realizing such a system requires research in many areas, including large vocabulary continuous speech recognition and extraction of important sentences from lecture contents. CJLC is designed as a foundation for this research, and consists of speech, transcriptions, and slides collected in real university classroom lectures. This paper also explains the differences in disfluency acts between classroom lectures and academic presentations.

Robust Extraction of Named Entity Including Unfamiliar Word
Masatoshi Tsuchiya | Shinya Hida | Seiichi Nakagawa
Proceedings of ACL-08: HLT, Short Papers


Expanding Indonesian-Japanese Small Translation Dictionary Using a Pivot Language
Masatoshi Tsuchiya | Ayu Purwarianti | Toshiyuki Wakita | Seiichi Nakagawa
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions


Indonesian-Japanese CLIR Using Only Limited Resource
Ayu Purwarianti | Masatoshi Tsuchiya | Seiichi Nakagawa
Proceedings of the Workshop on How Can Computational Linguistics Improve Information Retrieval?

Chunking Japanese Compound Functional Expressions by Machine Learning
Masatoshi Tsuchiya | Takao Shime | Toshihiro Takagi | Takehito Utsuro | Kiyotaka Uchimoto | Suguru Matsuyoshi | Satoshi Sato | Seiichi Nakagawa
Proceedings of the Workshop on Multi-word-expressions in a multilingual context


Integrating Cross-Lingually Relevant News Articles and Monolingual Web Documents in Bilingual Lexicon Acquisition
Takehito Utsuro | Kohei Hino | Mitsuhiro Kida | Seiichi Nakagawa | Satoshi Sato
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

An Empirical Study on Multiple LVCSR Model Combination by Machine Learning
Takehito Utsuro | Yasuhiro Kodama | Tomohiro Watanabe | Hiromitsu Nishizaki | Seiichi Nakagawa
Proceedings of HLT-NAACL 2004: Short Papers


Interpreter for Highly Portable Spoken Dialogue System
Masamitsu Umeda | Satoru Kogure | Seiichi Nakagawa
Proceedings of the Fourth SIGdial Workshop of Discourse and Dialogue


English Speech Database Read by Japanese Learners for CALL System Development
N. Minematsu | Y. Tomiyama | K. Yoshimoto | K. Shimizu | S. Nakagawa | M. Dantsuji | S. Makino
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)


A Robust Dialogue System with Spontaneous Speech Understanding and Cooperative Response
Toshihiko Itoh | Akihiro Denda | Satoru Kogure | Seiichi Nakagawa
Interactive Spoken Dialog Systems: Bringing Speech and NLP Together in Real Applications