Jun Liu


2021

pdf bib
Analyzing the Forgetting Problem in Pretrain-Finetuning of Open-domain Dialogue Response Models
Tianxing He | Jun Liu | Kyunghyun Cho | Myle Ott | Bing Liu | James Glass | Fuchun Peng
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

In this work, we study how the finetuning stage in the pretrain-finetune framework changes the behavior of a pretrained neural language generator. We focus on the transformer encoder-decoder model for the open-domain dialogue response generation task. Our major finding is that after standard finetuning, the model forgets some of the important language generation skills acquired during large-scale pretraining. We demonstrate the forgetting phenomenon through a set of detailed behavior analysis from the perspectives of knowledge transfer, context sensitivity, and function space projection. As a preliminary attempt to alleviate the forgetting problem, we propose an intuitive finetuning strategy named “mix-review”. We find that mix-review effectively regularizes the finetuning process, and the forgetting problem is alleviated to some extent. Finally, we discuss interesting behavior of the resulting dialogue model and its implications.

2018

pdf bib
Sentence Suggestion of Japanese Functional Expressions for Chinese-speaking Learners
Jun Liu | Hiroyuki Shindo | Yuji Matsumoto
Proceedings of ACL 2018, System Demonstrations

We present a computer-assisted learning system, Jastudy, which is particularly designed for Chinese-speaking learners of Japanese as a second language (JSL) to learn Japanese functional expressions with suggestion of appropriate example sentences. The system automatically recognizes Japanese functional expressions using a free Japanese morphological analyzer MeCab, which is retrained on a new Conditional Random Fields (CRF) model. In order to select appropriate example sentences, we apply a pairwise-based machine learning tool, Support Vector Machine for Ranking (SVMrank) to estimate the complexity of the example sentences using Japanese–Chinese homographs as an important feature. In addition, we cluster the example sentences that contain Japanese functional expressions with two or more meanings and usages, based on part-of-speech, conjugation forms of verbs and semantic attributes, using the K-means clustering algorithm in Scikit-Learn. Experimental results demonstrate the effectiveness of our approach.

pdf bib
Automatic Error Correction on Japanese Functional Expressions Using Character-based Neural Machine Translation
Jun Liu | Fei Cheng | Yiran Wang | Hiroyuki Shindo | Yuji Matsumoto
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

2017

pdf bib
Sentence Complexity Estimation for Chinese-speaking Learners of Japanese
Jun Liu | Yuji Matsumoto
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation

2016

pdf bib
Simplification of Example Sentences for Learners of Japanese Functional Expressions
Jun Liu | Yuji Matsumoto
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

Learning functional expressions is one of the difficulties for language learners, since functional expressions tend to have multiple meanings and complicated usages in various situations. In this paper, we report an experiment of simplifying example sentences of Japanese functional expressions especially for Chinese-speaking learners. For this purpose, we developed “Japanese Functional Expressions List” and “Simple Japanese Replacement List”. To evaluate the method, we conduct a small-scale experiment with Chinese-speaking learners on the effectiveness of the simplified example sentences. The experimental results indicate that simplified sentences are helpful in learning Japanese functional expressions.

2010

pdf bib
CMDMC: A Diachronic Digital Museum of Chinese Mandarin
Min Hou | Yu Zou | Yonglin Teng | Wei He | Yan Wang | Jun Liu | Jiyuan Wu
CIPS-SIGHAN Joint Conference on Chinese Language Processing