Junko Hosaka


PBIE: A Data Preparation Toolkit Toward Developing a Parsing-Based Information Extraction System
Junko Hosaka | Igor V. Kurochkin | Akihiko Konagaya
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

We have developed a toolkit in which an annotation tool, a syntactic tree editor, and an extraction rule editor interact dynamically. Its output can be stored in a database for further use. In the field of biomedicine, there is a critical need for automatic text processing. However, current language processing approaches suffer from insufficient basic data incorporating both human domain expertise and domain-specific language processing capabilities. With the annotation tool presented here, a set of “gold standards” can be collected, representing what should be extracted. At the same time, any change in annotation can be viewed on an associated syntactic tree. These facilities provide a clear picture of the relationship between the extraction target and the syntactic tree. Underlying sentences can be analyzed with a parser which can be plugged in, or a set of parsed sentences can be used to generate the tree. Extraction rules written with the integrated editor can be applied at once, and their validity can immediately be verified both on the syntactic tree and on the sentence string by coloring the corresponding segments. Thus our toolkit enables the user to efficiently construct parse-based extraction rules. PBIE2 works under Windows 2000/XP and requires Microsoft Internet Explorer 6.0 or higher. The data can be stored in Microsoft Access.


Effect of utilizing terminology on extraction of protein-protein interaction information from biomedical literature
Junko Hosaka | Judice L. Y. Koh | Akihiko Konagaya
10th Conference of the European Chapter of the Association for Computational Linguistics


Pause as a Phrase Demarcator for Speech and Language Processing
Junko Hosaka | Mark Seligman | Harald Singer
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics


Construction of Corpus-Based Syntactic Rules for Accurate Speech Recognition
Junko Hosaka | Toshiyuki Takezawa
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics