Developing Corpus of Japanese Classroom Lecture Speech Contents
Masatoshi Tsuchiya, Satoru Kogure, Hiromitsu Nishizaki, Kengo Ohta, Seiichi Nakagawa
Abstract
This paper describes the Corpus of Japanese classroom Lecture speech Contents (henceforth CJLC), which we are developing. The growing volume of e-Learning content demands a sophisticated interactive browsing system, but existing tools do not satisfy this requirement. Realizing such a system requires research in several areas, including large vocabulary continuous speech recognition and extraction of important sentences from lecture contents. CJLC is designed as a fundamental basis for this research, and consists of speech, transcriptions, and slides collected in real university classroom lectures. This paper also describes the differences in disfluencies between classroom lectures and academic presentations.
- Anthology ID:
- L08-1506
- Volume:
- Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
- Month:
- May
- Year:
- 2008
- Address:
- Marrakech, Morocco
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Daniel Tapias
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- URL:
- http://www.lrec-conf.org/proceedings/lrec2008/pdf/524_paper.pdf
- Cite (ACL):
- Masatoshi Tsuchiya, Satoru Kogure, Hiromitsu Nishizaki, Kengo Ohta, and Seiichi Nakagawa. 2008. Developing Corpus of Japanese Classroom Lecture Speech Contents. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
- Cite (Informal):
- Developing Corpus of Japanese Classroom Lecture Speech Contents (Tsuchiya et al., LREC 2008)