2023
StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse Representations and Content Enhancing
Xuekai Zhu, Jian Guan, Minlie Huang, Juan Liu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Non-parallel text style transfer is an important task in natural language generation. However, previous studies concentrate on the token or sentence level, such as sentence sentiment and formality transfer, and neglect style transfer at the discourse level for long texts. Long texts usually involve more complicated author linguistic preferences, such as discourse structures, than sentences. In this paper, we formulate the task of non-parallel story author-style transfer, which requires transferring an input story into a specified author style while maintaining source semantics. To tackle this problem, we propose a generation model, named StoryTrans, which leverages discourse representations to capture source content information and transfer it to target styles with learnable style embeddings. We use an additional training objective to disentangle stylistic features from the learned discourse representations to prevent the model from degenerating into an auto-encoder. Moreover, to enhance content preservation, we design a mask-and-fill framework to explicitly fuse style-specific keywords of the source texts into generation. Furthermore, we construct new datasets for this task in Chinese and English, respectively. Extensive experiments show that our model outperforms strong baselines in overall performance on style transfer and content preservation.
2014
A tunable language model for statistical machine translation
Junfei Guo, Juan Liu, Qi Han, Andreas Maletti
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track
A novel variation of the modified Kneser-Ney model using monomial discounting is presented and integrated into the Moses statistical machine translation toolkit. The language model is trained on a large training set as usual, but its new discount parameters are tuned on the small development set. An in-domain and cross-domain evaluation of the language model is performed based on perplexity, in which sizable improvements are obtained. Additionally, the performance of the language model is also evaluated in several major machine translation tasks, including Chinese-to-English. In those tests, the test data come from a (slightly) different domain than the training data. The experimental results indicate that the new model significantly outperforms a baseline model using SRILM in those domain adaptation scenarios. The new language model is thus ideally suited for domain adaptation without sacrificing performance in in-domain experiments.
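For background on the discounting being tuned here, the following is a minimal, illustrative sketch of standard interpolated Kneser-Ney bigram smoothing with a single absolute discount `d` (the kind of parameter the paper tunes on the development set). It is not the paper's monomial-discounting variant, nor Moses/SRILM code; `kn_bigram_model` is a hypothetical helper written for this sketch.

```python
from collections import Counter

def kn_bigram_model(tokens, d=0.75):
    """Interpolated Kneser-Ney bigram probabilities with one
    tunable absolute discount d (illustrative sketch only)."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    contexts = Counter(tokens[:-1])            # c(v): times v occurs as a context
    # Continuation count: in how many distinct contexts does w appear?
    cont = Counter(w for (_, w) in bigrams)
    total_bigram_types = len(bigrams)
    # Number of distinct words that follow each context v
    followers = Counter(v for (v, _) in bigrams)

    def prob(w, v):
        # Unigram continuation probability P_cont(w)
        p_cont = cont[w] / total_bigram_types
        if contexts[v] == 0:
            return p_cont                      # unseen context: back off fully
        discounted = max(bigrams[(v, w)] - d, 0) / contexts[v]
        backoff_weight = d * followers[v] / contexts[v]
        return discounted + backoff_weight * p_cont

    return prob

tokens = "the cat sat on the mat the cat ran".split()
p = kn_bigram_model(tokens)
vocab = set(tokens)
# The discounted mass plus the redistributed backoff mass sums to 1
total = sum(p(w, "the") for w in vocab)
```

Tuning `d` (or, in the paper's variant, the monomial discount parameters) against development-set perplexity is what adapts such a model to a new domain without retraining on the large corpus.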
2011
Deploying MT into a Localisation Workflow: Pains and Gains
Yanli Sun, Juan Liu, Yi Li
Proceedings of Machine Translation Summit XIII: Papers
Combining ConceptNet and WordNet for Word Sense Disambiguation
Junpeng Chen, Juan Liu
Proceedings of 5th International Joint Conference on Natural Language Processing
Question classification based on an extended class sequential rule model
Zijing Hui, Juan Liu, Lumei Ouyang
Proceedings of 5th International Joint Conference on Natural Language Processing
2008
Mining Chinese-English Parallel Corpora from the Web
Bo Li, Juan Liu
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II
2007
Mining Parallel Text from the Web based on Sentence Alignment
Bo Li, Juan Liu, Huili Zhu
Proceedings of the 21st Pacific Asia Conference on Language, Information and Computation