Shun Hasegawa


2017

Japanese Sentence Compression with a Large Training Dataset
Shun Hasegawa | Yuta Kikuchi | Hiroya Takamura | Manabu Okumura
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

For English, high-quality deletion-based sentence compression models have been trained on large, automatically created training datasets. We apply a similar approach to Japanese sentence compression. To create a large Japanese training dataset, we modify a method for creating English training datasets to account for the characteristics of the Japanese language. We then use the resulting dataset to train Japanese sentence compression models based on recurrent neural networks.