Masato Tokuhisa


2014

Automatic Detection and Analysis of Impressive Japanese Sentences Using Supervised Machine Learning
Daiki Hazure | Masaki Murata | Masato Tokuhisa
Proceedings of the First AHA!-Workshop on Information Discovery in Text

2009

Statistical machine translation adding pattern-based machine translation in Chinese-English translation
Jin’ichi Murakami | Masato Tokuhisa | Satoru Ikehara
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign

We have developed a two-stage machine translation (MT) system. The first stage is a rule-based MT system, and the second stage is a standard statistical MT system. For Chinese-English translation, we first used a Chinese-English rule-based MT system to obtain "ENGLISH" sentences from Chinese sentences. We then used a standard statistical MT system to translate "ENGLISH" into English. We believe this method has two advantages: it leaves fewer unknown words, and it produces structured or grammatically correct sentences. In our experiments, the proposed method obtained a BLEU score of 0.3151 on the BTEC-CE task, whereas a standard method (moses) obtained 0.3311. The proposed method was therefore not as effective for the BTEC-CE task, and we will try to improve its performance by optimizing parameters.

2008

Non-Compositional Language Model and Pattern Dictionary Development for Japanese Compound and Complex Sentences
Satoru Ikehara | Masato Tokuhisa | Jin’ichi Murakami
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Statistical machine translation without long parallel sentences for training data.
Jin’ichi Murakami | Masato Tokuhisa | Satoru Ikehara
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

In this study, we focused on the reliability of the phrase table. We have been building phrase tables using Och’s method [2], which sometimes generates completely wrong phrase tables. We found that such phrase tables were caused by long parallel sentences, so we removed these long parallel sentences from the training data. We also used general tools for statistical machine translation, such as "Giza++" [3], "moses" [4], and "training-phrase-model.perl" [5]. Our proposed method obtained BLEU scores of 0.4047 (TEXT) and 0.3553 (1-BEST) on the Challenge-EC task, whereas a standard method obtained 0.3975 (TEXT) and 0.3482 (1-BEST). The proposed method was thus effective for the Challenge-EC task, but it was not effective for the BTEC-CE and Challenge-CE tasks. Overall, our system did not perform well: for example, it placed 7th among 8 systems in the Challenge-EC task.
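
The filtering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `drop_long_pairs`, the whitespace tokenization, and the 40-token threshold are all assumptions made for illustration, since the abstract does not state how "long" was defined.

```python
from typing import Iterable, List, Tuple

SentencePair = Tuple[str, str]


def drop_long_pairs(
    pairs: Iterable[SentencePair], max_tokens: int = 40
) -> List[SentencePair]:
    """Keep only pairs whose source and target both have at most max_tokens tokens.

    The threshold and the whitespace tokenization are illustrative assumptions,
    not values taken from the paper.
    """
    kept = []
    for source, target in pairs:
        if len(source.split()) <= max_tokens and len(target.split()) <= max_tokens:
            kept.append((source, target))
    return kept


# Toy pre-tokenized corpus: one short pair and one artificially long pair.
corpus = [
    ("我 想 订 一 个 房间", "I would like to book a room"),
    ("请 问 " + "很 " * 50 + "远 吗", "is it " + "very " * 50 + "far"),
]
filtered = drop_long_pairs(corpus, max_tokens=40)
```

Only the filtered pairs would then be passed on to word alignment and phrase-table extraction.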

2007

Statistical machine translation using large J/E parallel corpus and long phrase tables
Jin’ichi Murakami | Masato Tokuhisa | Satoru Ikehara
Proceedings of the Fourth International Workshop on Spoken Language Translation

We describe our statistical machine translation system, which uses a large set of Japanese-English parallel sentences and long phrase tables. We collected 698,973 Japanese-English parallel sentences and used long phrase tables. We also used general tools for statistical machine translation, such as "Giza++" [1], "moses" [2], and "training-phrase-model.perl" [3]. Using these data and tools, we took part in the IWSLT07 evaluation and obtained a BLEU score of 0.4321.