Xiantao Zhang


2025

AuraDial: A Large-Scale Human-Centric Dialogue Dataset for Chinese AI Psychological Counseling
Xiantao Zhang
Findings of the Association for Computational Linguistics: EMNLP 2025

This paper introduces AuraDial, a large-scale, human-centric dialogue dataset for Chinese AI psychological counseling, comprising over 300,000 single-turn dialogues and 90,000 multi-turn dialogue sessions. A key distinction of AuraDial is its instruction set, which is primarily derived from real-world user queries and therefore reflects genuine expression patterns better than synthetic or template-based alternatives. Furthermore, we propose an innovative rephrasing-based data generation methodology designed to foster more human-like and empathetic responses, addressing a common shortcoming in AI-generated dialogue. Experimental results demonstrate that models fine-tuned on AuraDial significantly outperform those trained on other public datasets in generating empathetic and relevant replies. AuraDial offers a novel, valuable resource to the Chinese NLP community for advancing AI in psychological counseling. The dataset is publicly available at [https://huggingface.co/datasets/Mxode/AuraDial](https://huggingface.co/datasets/Mxode/AuraDial).

2014

Learning the Taxonomy of Function Words for Parsing
Dongchen Li | Xiantao Zhang | Dingsheng Luo | Xihong Wu
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Learning Grammar with Explicit Annotations for Subordinating Conjunctions
Dongchen Li | Xiantao Zhang | Xihong Wu
Proceedings of the ACL 2014 Student Research Workshop

2013

Improved Chinese Parsing Using Named Entity Cue
Dongchen Li | Xiantao Zhang | Xihong Wu
Proceedings of the 13th International Conference on Parsing Technologies (IWPT 2013)