Itsuki Toyota


Compositional translation of technical terms by integrating patent families as a parallel corpus and a comparable corpus
Itsuki Toyota | Zi Long | Lijuan Dong | Takehito Utsuro | Mikio Yamamoto
Proceedings of the 5th Workshop on Patent Translation


Detecting Japanese Compound Functional Expressions using Canonical/Derivational Relation
Takafumi Suzuki | Yusuke Abe | Itsuki Toyota | Takehito Utsuro | Suguru Matsuyoshi | Masatoshi Tsuchiya
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The Japanese language has various types of functional expressions. To organize Japanese functional expressions with their various surface forms, a hierarchically organized lexicon of Japanese functional expressions was compiled. This paper proposes a framework for identifying more than 16,000 functional expressions in Japanese texts by utilizing the hierarchical organization of the lexicon. In our framework, these functional expressions are roughly divided into canonical and derived functional expressions. Each derived functional expression is intended to be identified by referring to the most similar occurrence of its canonical expression. Contextual occurrence information for the much smaller set of canonical expressions is expanded to cover all forms of the derived expressions, and is then used when identifying those derived expressions. We also show empirically that the proposed method can correctly identify more than 80% of the functional / content usages with fewer than 38,000 training instances of manually identified canonical expressions.