Chang Yang

2026

Idiomatic Expression Generation, which aims to produce idiomatic text from plain text, is a valuable yet challenging NLP task. However, existing methods suffer from the scarcity of parallel data and a dependence on high-quality manual annotations. To address this, we propose Auto-IDEA, an iterative LLM-SLM (Large Language Model-Small Language Model) collaborative framework that replaces human supervision for idiomatic expression data generation. In this self-improving cycle, the LLM constructs parallel corpora (idiomatic and plain text) via bidirectional semantic reconstruction, automatically generating "Locate-Then-Polish" (LTP) annotations; the SLM filters low-quality corpora while continuously enhancing its verification ability through incremental learning. We instantiate Auto-IDEA for Chinese Idiom Polishing (CIP), constructing CIP-200K, a large-scale dataset of 206K parallel sentences with LTP annotations. Qwen3-8B fine-tuned on CIP-200K achieves a 25.2% absolute Idiom Polishing Accuracy (IPA) improvement over a supervised fine-tuning (SFT) baseline, outperforming DeepSeek-R1 by 6.2%. Extensive experiments (e.g., Chinese idiom cloze tests and English idiom generation tasks) and human evaluations verify the generalization and effectiveness of Auto-IDEA, demonstrating a new pathway for high-quality, annotation-free data generation through LLM-SLM collaboration.

2025

Rethink Rumor Detection in the Era of LLMs: A Review
Chang Yang | Peng Zhang | Jing Zhang | Hui Gao | Changhao Song
Findings of the Association for Computational Linguistics: EMNLP 2025

The rise of large language models (LLMs) has fundamentally reshaped the technological paradigm of rumor detection, offering transformative opportunities to construct adaptive detection systems while simultaneously ushering in new threats, such as “logically perfect rumors”. This paper aims to unify existing methods in the field of rumor detection and reveal the logical mechanisms behind them. From the perspective of complex systems, we propose a Cognition-Interaction-Behavior (CIB) tri-level framework for rumor detection based on collective intelligence and explore the synergistic relationship between LLMs and collective intelligence in rumor governance. We identify promising future research directions, including advancing agent-based modeling to capture complex rumor dynamics, addressing emerging challenges unique to the LLM era, and incorporating interdisciplinary perspectives. We hope this work lays a theoretical foundation for next-generation rumor detection paradigms and offers valuable insights for advancing the field.

2024

Deciphering Rumors: A Multi-Task Learning Approach with Intent-aware Hierarchical Contrastive Learning
Chang Yang | Peng Zhang | Hui Gao | Jing Zhang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Social networks are rife with noise and misleading information, presenting multifaceted challenges for rumor detection. In this paper, from the perspective of human cognitive subjectivity, we introduce the mining of individual latent intentions and propose a novel multi-task learning framework, the Intent-Aware Rumor Detection Network (IRDNet). IRDNet is designed to discern multi-level rumor semantic features and latent user intentions, addressing the challenges of robustness and key feature mining and alignment that plague existing models. In IRDNet, the multi-level semantic extraction module captures sequential and hierarchical features to generate robust semantic representations. The hierarchical contrastive learning module incorporates two complementary strategies, event-level and intent-level, to establish cognitive anchors that uncover the latent intentions of information disseminators. Event-level contrastive learning employs high-quality data augmentation and adversarial perturbations to enhance model robustness. Intent-level contrastive learning leverages the intent encoder to capture latent intent features and optimize consistency within the same intent while ensuring heterogeneity between different intents to clearly distinguish key features from irrelevant elements. Experimental results demonstrate that IRDNet significantly improves the effectiveness of rumor detection and effectively addresses the challenges present in the field of rumor detection.
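
The intent-level objective described in the abstract above (consistency within the same intent, heterogeneity between different intents) resembles a supervised contrastive loss. The following is a minimal sketch under that assumption, not IRDNet's actual objective: the function name, the temperature value, and the toy embeddings are all hypothetical and do not come from the paper.

```python
import math


def _normalize(v):
    # L2-normalize a vector; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors with at least one positive.

    Samples sharing a label (a hypothetical intent id) are pulled together;
    samples with different labels are pushed apart.
    """
    z = [_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity of anchor i to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy embeddings: two tight clusters in 2-D.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(toy, [0, 0, 1, 1])
misaligned = supervised_contrastive_loss(toy, [0, 1, 0, 1])
```

In this toy setup the loss is lower when the labels match the geometric clusters (`aligned`) than when they cut across them (`misaligned`), which is the behavior a within-intent consistency objective is meant to reward.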
