A Syntactically Annotated Corpus of Japanese Spoken Monologue
Tomohiro Ohno, Shigeki Matsubara, Hideki Kashioka, Naoto Kato, Yasuyoshi Inagaki
Abstract
Recently, monologue data such as lecture and commentary by professionals have been considered as valuable intellectual resources, and have been gathering attention. On the other hand, in order to use these monologue data effectively and efficiently, it is necessary for the monologue data not only just to be accumulated but also to be structured. This paper describes the construction of a Japanese spoken monologue corpus in which dependency structure is given to each utterance. Spontaneous monologue includes a lot of very long sentences composed of two or more clauses. In these sentences, there may exist the subject or the adverb common to multi-clauses, and it may be considered that the subject or adverb depend on multi-predicates. In order to give the dependency information in a real fashion, our research allows that a bunsetsu depends on multiple bunsetsus.- Anthology ID:
- L06-1056
- Volume:
- Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
- Month:
- May
- Year:
- 2006
- Address:
- Genoa, Italy
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/106_pdf.pdf
- DOI:
- Cite (ACL):
- Tomohiro Ohno, Shigeki Matsubara, Hideki Kashioka, Naoto Kato, and Yasuyoshi Inagaki. 2006. A Syntactically Annotated Corpus of Japanese Spoken Monologue. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
- Cite (Informal):
- A Syntactically Annotated Corpus of Japanese Spoken Monologue (Ohno et al., LREC 2006)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2006/pdf/106_pdf.pdf