Tadaaki Oshio


2017

Comparison of SMT and NMT trained with large Patent Corpora: Japio at WAT2017
Satoshi Kinoshita | Tadaaki Oshio | Tomoharu Mitsuhashi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

Japio participates in the patent subtasks (JPC-EJ/JE/CJ/KJ) with phrase-based statistical machine translation (SMT) and neural machine translation (NMT) systems trained on its own patent corpora in addition to the subtask corpora provided by the WAT2017 organizers. In the EJ and CJ subtasks, the SMT and NMT systems, trained on about 50 million and 10 million sentence pairs respectively, achieved comparable scores in automatic evaluations, but the NMT systems were superior to the SMT systems in both the official and in-house human evaluations.

2016

Translation Using JAPIO Patent Corpora: JAPIO at WAT2016
Satoshi Kinoshita | Tadaaki Oshio | Tomoharu Mitsuhashi | Terumasa Ehara
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

We participate in the scientific paper subtask (ASPEC-EJ/CJ) and the patent subtask (JPC-EJ/CJ/KJ) with phrase-based SMT systems trained on our own patent corpora. Using larger corpora than those prepared by the workshop organizer, we achieved higher BLEU scores than most participants in the EJ and CJ translations of the patent subtask, but in the crowdsourcing evaluation, our EJ translation, which was the best in all automatic evaluations, received a very poor score. In the scientific paper subtask, our translations received lower scores than most translations produced by engines trained on the in-domain corpora, but our scores were higher than those of general-purpose RBMT systems and online services. The crowdsourcing evaluation results suggest that a CJ SMT system trained on a large patent corpus may translate non-patent technical documents at a practical level.

1987

Applied Testing of HICATS/JE for Japanese Patent Abstracts
Tadaaki Oshio
Proceedings of Machine Translation Summit I