Sen Hu
2026
From Style to Story: A Curriculum Learning Approach for Imitative Novel Generation
Xueran Han | Yuhan Liu | Mingzhe Li | Wei Liu | Sen Hu | Rui Yan | Zhiqiang xu | Xiuying Chen
Findings of the Association for Computational Linguistics: ACL 2026
Great novels create immersive worlds with rich character arcs, well-structured plots, and nuanced writing styles. However, current novel generation methods often rely on brief, simplistic story outlines and generate details using plain, generic language. To bridge this gap, we introduce the task of Imitative Novel Generation, which requires the generated novels to imitate the distinctive features of the original work, including understanding character profiles and world views, predicting plausible plot developments, and writing concrete details using vivid, expressive language. To achieve this, we propose WriterAgent, a novel generation system designed to master the core aspects of literary imitation. WriterAgent is trained through a curriculum learning paradigm, progressing from low-level stylistic mastery to high-level narrative coherence. Its key tasks include language style learning, character modeling, plot planning, and stylish writing, ensuring comprehensive narrative control. To support this, WriterAgent leverages the WriterLoRA framework, an extension of LoRA with hierarchical and cumulative task-specific modules, each specializing in a different narrative aspect. We evaluate WriterAgent on multilingual classics like Harry Potter and Dream of the Red Chamber, demonstrating its superiority over baselines in capturing the target author’s settings, character dynamics, and writing style to produce coherent, faithful narratives. We hope this work inspires literary creativity in NLP: WriterAgent.
CloneMem: Benchmarking Long-Term Memory for AI Clones
Sen Hu | Zhiyu Zhang | Yuxiang Wei | Xueran Han | Zhenheng Tang | Ronghao Chen | Huacan Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
AI Clones aim to simulate an individual’s thoughts and behaviors to enable long-term, personalized interaction, placing stringent demands on memory systems to model experiences, emotions, and opinions over time. Existing memory benchmarks primarily rely on user–agent conversational histories, which are temporally fragmented and insufficient for capturing continuous life trajectories. We introduce CloneMem, a benchmark for evaluating long-term memory in AI Clone scenarios grounded in non-conversational digital traces, including diaries, social media posts, and emails, spanning one to three years. CloneMem adopts a top-down data construction framework to ensure longitudinal coherence and defines tasks that assess an agent’s ability to track evolving personal states. Experiments show that current memory mechanisms struggle in this setting, highlighting open challenges for life-grounded personalized AI. Code and dataset are available at https://github.com/AvatarMemory/CloneMemBench
KnowMe-Bench: Benchmarking Person Understanding for Lifelong Digital Companions
Tingyu Wu | Zhisheng Chen | Ziyan Weng | Shuhe Wang | Shuo Zhang | Sen Hu | Silin Wu | Qizhen Lan | Huacan Wang | Ronghao Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Existing long-horizon memory benchmarks mostly use multi-turn dialogues or synthetic user histories, which makes retrieval performance an imperfect proxy for person understanding. We present KnowMe-Bench, a publicly releasable benchmark built from long-form autobiographical narratives, where actions, context, and inner thoughts provide dense evidence for inferring stable motivations and decision principles. KnowMe-Bench reconstructs each narrative into a flashback-aware, time-anchored stream and evaluates models with evidence-linked questions spanning factual recall, subjective state attribution, and principle-level reasoning. Across diverse narrative sources, retrieval-augmented systems mainly improve factual accuracy, while errors persist on temporally grounded explanations and higher-level inferences, highlighting the need for memory mechanisms beyond retrieval.
Does Memory Need Graphs? A Unified Framework and Empirical Analysis for Long-Term Dialog Memory
Sen Hu | Yuxiang Wei | Jiaxin Ran | Xueran Han | Zhiyuan Yao | Huacan Wang | Ronghao Chen | Lei Zou
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction
Haonan Bian | Zhiyuan Yao | Sen Hu | Zishan Xu | Shaolei Zhang | Yifu Guo | Ziliang Yang | Xueran Han | Huacan Wang | Ronghao Chen
Findings of the Association for Computational Linguistics: ACL 2026
As Large Language Models (LLMs) evolve from static dialogue interfaces to autonomous general agents, effective memory is paramount to ensuring long-term consistency. However, existing benchmarks primarily focus on casual conversation or task-oriented dialogue, failing to capture “long-term project-oriented” interactions where agents must track evolving goals. To bridge this gap, we introduce RealMem, the first benchmark grounded in realistic project scenarios. RealMem comprises over 2,000 cross-session dialogues across eleven scenarios, utilizing natural user queries for evaluation. We propose a synthesis pipeline that integrates Project Foundation Construction, Multi-Agent Dialogue Generation, and Memory and Schedule Management to simulate the dynamic evolution of memory. Experiments reveal that current memory systems face significant challenges in managing the long-term project states and dynamic context dependencies inherent in real-world projects. Our code and datasets are available at https://anonymous.4open.science/r/realmem-A1E4.
2024
Are LLM-based Evaluators Confusing NLG Quality Criteria?
Xinyu Hu | Mingqi Gao | Sen Hu | Yang Zhang | Yicheng Chen | Teng Xu | Xiaojun Wan
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Some prior work has shown that LLMs perform well in NLG evaluation for different tasks. However, we discover that LLMs seem to confuse different evaluation criteria, which reduces their reliability. For further verification, we first address the inconsistent conceptualization and vague expression found in existing NLG quality criteria themselves. To that end, we summarize a clear hierarchical classification system for 11 common aspects, with the corresponding criteria drawn from previous studies. Inspired by behavioral testing, we carefully design 18 types of aspect-targeted perturbation attacks for fine-grained analysis of the evaluation behaviors of different LLMs. We also conduct human annotations beyond the guidance of the classification system to validate the impact of the perturbations. Our experimental results reveal confusion issues inherent in LLMs, as well as other noteworthy phenomena, and call for further research and improvements in LLM-based evaluation.
2023
Improving Knowledge Production Efficiency With Question Answering on Conversation
Changlin Yang | Siye Liu | Sen Hu | Wangshu Zhang | Teng Xu | Jing Zheng
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)
Through an online customer service application, we have collected many conversations between customer service agents and customers. Building a knowledge production system, whose core module is question answering (QA) on these conversations, can help reduce the labor cost of maintaining the FAQ database for the customer service chatbot. However, most existing research focuses on document-based QA tasks, and there is a lack of research on conversation-based QA and related datasets, especially in Chinese. The challenges of conversation-based QA include: 1) answers may be scattered among multiple dialogue turns; 2) understanding complex dialogue contexts is more complicated than understanding documents. To address these challenges, we propose a multi-span extraction model for this task and introduce continual pre-training and multi-task learning schemes to further improve model performance. To validate our approach, we construct two Chinese datasets using dialogues as the knowledge source, namely cs-qaconv and kd-qaconv. Experimental results demonstrate that the proposed model outperforms the baseline on both datasets. The online application also verifies the effectiveness of our method. The dataset kd-qaconv will be released publicly for research purposes.
AdapterDistillation: Non-Destructive Task Composition with Knowledge Distillation
Junjie Wang | Yicheng Chen | Wangshu Zhang | Sen Hu | Teng Xu | Jing Zheng
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
Leveraging knowledge from multiple tasks by introducing a small number of task-specific parameters, known as adapters, into each transformer layer has received much attention recently. However, adding an extra fusion layer to implement knowledge composition not only increases the inference time but is also non-scalable for some applications. To avoid these issues, we propose a two-stage knowledge distillation algorithm called AdapterDistillation. In the first stage, we extract task-specific knowledge by using local data to train a student adapter. In the second stage, we distill the knowledge from the existing teacher adapters into the student adapter to help its inference. Extensive experiments on frequently asked question retrieval in task-oriented dialog systems validate the efficiency of AdapterDistillation. We show that AdapterDistillation outperforms existing algorithms in terms of accuracy, resource consumption and inference time.
2021
NAMER: A Node-Based Multitasking Framework for Multi-Hop Knowledge Base Question Answering
Minhao Zhang | Ruoyu Zhang | Lei Zou | Yinnian Lin | Sen Hu
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations
We present NAMER, an open-domain Chinese knowledge base question answering system based on a novel node-based framework that better grasps the structural mapping between questions and KB queries by aligning the nodes in a query with their corresponding mentions in the question. Equipped with techniques including data augmentation and multitasking, we show that the proposed framework outperforms the previous SoTA on the CCKS CKBQA dataset. Moreover, we develop a novel data annotation strategy that facilitates the node-to-mention alignment; a dataset (https://github.com/ridiculouz/CKBQA) built with this strategy is also published to promote further research. An online demo of NAMER (http://kbqademo.gstore.cn) is provided to visualize our framework and supply extra information for users; a video illustration (https://youtu.be/yetnVye_hg4) of NAMER is also available.
2018
A State-transition Framework to Answer Complex Questions over Knowledge Base
Sen Hu | Lei Zou | Xinbo Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Although natural language question answering over knowledge graphs has been studied in the literature, existing methods have some limitations in answering complex questions. To address that, in this paper, we propose a State Transition-based approach to translate a complex natural language question N to a semantic query graph (SQG), which is used to match the underlying knowledge graph to find the answers to question N. In order to generate the SQG, we propose four primitive operations (expand, fold, connect and merge) and a learning-based state transition approach. Extensive experiments on several benchmarks (such as QALD, WebQuestions and ComplexQuestions) with two knowledge bases (DBpedia and Freebase) confirm the superiority of our approach compared with state-of-the-art methods.
Co-authors
- Ronghao Chen 4
- Xueran Han 4
- Huacan Wang 4
- Teng Xu 3
- Lei Zou 3
- Yicheng Chen 2
- Yuxiang Wei 2
- Zhiyuan Yao 2
- Wangshu Zhang 2
- Jing Zheng 2
- Haonan Bian 1
- Xiuying Chen 1
- Zhisheng Chen 1
- Mingqi Gao 1
- Yifu Guo 1
- Xinyu Hu 1
- Qizhen Lan 1
- Mingzhe Li 1
- Yinnian Lin 1
- Yuhan Liu 1
- Wei Liu 1
- Siye Liu 1
- Jiaxin Ran 1
- Zhenheng Tang 1
- Xiaojun Wan 1
- Junjie Wang 1
- Shuhe Wang 1
- Ziyan Weng 1
- Tingyu Wu 1
- Silin Wu 1
- Zishan Xu 1
- Rui Yan 1
- Changlin Yang 1
- Ziliang Yang 1
- Xinbo Zhang 1
- Zhiyu Zhang 1
- Shuo Zhang 1
- Yang Zhang 1
- Minhao Zhang 1
- Ruoyu Zhang 1
- Shaolei Zhang 1
- Zhiqiang xu 1