2025
Think Both Ways: Teacher-Student Bidirectional Reasoning Enhances MCQ Generation and Distractor Quality
Yimiao Qiu | Yang Deng | Quanming Yao | Zhimeng Zhang | Zhiang Dong | Chang Yao | Jingyuan Chen
Findings of the Association for Computational Linguistics: ACL 2025
Generating high-quality Multiple Choice Questions (MCQs) remains challenging for educational tools due to the need for contextual relevance and plausible distractors. Existing methods still struggle with these dual requirements, leading to questions that lack depth and distractors that are either too obvious or irrelevant. In this paper, we propose BiFlow, a novel framework that integrates bidirectional reasoning perspectives: teacher reasoning generates contextually relevant questions and plausible distractors, while student reasoning evaluates question clarity and the misleading nature of the distractors. To further enhance reasoning, we introduce PathFinder, a mechanism that employs breadth-first search and Chain-of-Thought (CoT) strategies to explore diverse reasoning paths, improving both the quality and diversity of generated questions and distractors. Additionally, we enrich the FairytaleQA dataset to FairytaleMCQ with high-quality distractors, providing a robust benchmark for MCQ generation. Experimental results demonstrate that BiFlow outperforms existing methods, particularly in generating text-grounded questions and high-quality distractors for narrative contexts, highlighting its value in educational applications.
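To make the PathFinder idea concrete, here is a minimal sketch of breadth-first search over candidate reasoning paths with beam pruning at each depth. The expand and score functions are hypothetical stand-ins for the LLM-driven CoT step proposal and quality scoring; nothing below is taken from the paper's implementation.

from dataclasses import dataclass, field

@dataclass
class Path:
    steps: list = field(default_factory=list)  # chain-of-thought steps so far
    score: float = 0.0                         # plausibility score of the path

def expand(path):
    # Hypothetical stand-in for an LLM call proposing candidate next CoT steps.
    return [path.steps + [f"step{len(path.steps)}.{i}"] for i in range(3)]

def score(steps):
    # Hypothetical scorer, e.g. teacher-student agreement on the draft question.
    return sum(len(s) for s in steps) % 7 / 7.0

def pathfinder_bfs(root, depth=3, beam=4):
    # Breadth-first exploration: expand every frontier path, then keep only
    # the top-`beam` candidates before descending one level deeper.
    frontier = [Path([root])]
    for _ in range(depth):
        candidates = [Path(s, score(s)) for p in frontier for s in expand(p)]
        frontier = sorted(candidates, key=lambda p: p.score, reverse=True)[:beam]
    return frontier

for p in pathfinder_bfs("read the story passage"):
    print(round(p.score, 2), p.steps)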
T2DR: A Two-Tier Deficiency-Resistant Framework for Incomplete Multimodal Learning
Han Lin | Xiu Tang | Huan Li | Wenxue Cao | Sai Wu | Chang Yao | Lidan Shou | Gang Chen
Findings of the Association for Computational Linguistics: ACL 2025
Multimodal learning is garnering significant attention for its capacity to represent diverse human perceptions (e.g., linguistic, acoustic, and visual signals), achieving more natural and intuitive interactions with technology. However, the frequent occurrence of incomplete data, either within a single modality (intra-modality) or across different modalities (inter-modality), presents substantial challenges in reliable semantic interpretation and model reasoning. Furthermore, there is currently no robust representation learning mechanism capable of managing both intra-modality and inter-modality real-data deficiencies. To address this challenge, we present T2DR, a two-tier deficiency-resistant framework for incomplete multimodal learning, which comprises two main modules: (1) the Intra-Modal Deficiency-Resistant module (IADR): to address fine-grained deficiencies, we introduce Intra-Attn to focus on the available data while avoiding excessive suppression of the missing regions; (2) the Inter-Modal Deficiency-Resistant module (IEDR): to handle coarse-grained deficiencies, we propose shared feature prediction (SFP) to leverage cross-modal shared features for preliminary data imputation, and subsequently apply Inter-Attn to allocate appropriate attention to each modality based on the results from the capability-aware scorer (CAS). Extensive experiments are performed on two well-known multimodal benchmarks, CMU-MOSI and CMU-MOSEI, across various missing scenarios for sentiment analysis. Experimental results show that T2DR significantly outperforms the SOTA models. Code is available at https://github.com/LH019/T2DR.
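As a rough picture of what an Intra-Attn step might look like, the sketch below applies self-attention over one modality's sequence with a finite penalty on missing positions, so they are down-weighted rather than fully suppressed. The tensor shapes, the zero-filling of missing frames, and the soft_penalty value are our assumptions, not details taken from T2DR.

import torch
import torch.nn.functional as F

def intra_attn(x, present_mask, soft_penalty=-4.0):
    # x:            (batch, seq, dim) features, missing positions zero-filled
    # present_mask: (batch, seq), 1.0 where data is available, 0.0 where missing
    # Missing keys receive a finite penalty instead of -inf, so attention to
    # them is reduced but not eliminated.
    d = x.size(-1)
    scores = x @ x.transpose(1, 2) / d ** 0.5        # (batch, seq, seq)
    bias = (1.0 - present_mask).unsqueeze(1) * soft_penalty
    attn = F.softmax(scores + bias, dim=-1)
    return attn @ x

# usage: sample 0 is missing its last two frames
x = torch.randn(2, 5, 8)
mask = torch.ones(2, 5)
mask[0, 3:] = 0.0
print(intra_attn(x, mask).shape)  # torch.Size([2, 5, 8])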
CoLA: Collaborative Low-Rank Adaptation
Yiyun Zhou | Chang Yao | Jingyuan Chen
Findings of the Association for Computational Linguistics: ACL 2025
The scaling law of Large Language Models (LLMs) reveals a power-law relationship, showing diminishing returns in performance as model scale increases. While training LLMs from scratch is resource-intensive, fine-tuning a pre-trained model for specific tasks has become a practical alternative. Full fine-tuning (FFT) achieves strong performance; however, it is computationally expensive and inefficient. Parameter-efficient fine-tuning (PEFT) methods, like LoRA, have been proposed to address these challenges by freezing the pre-trained model and adding lightweight task-specific modules. LoRA, in particular, has proven effective, but its application to multi-task scenarios is limited by interference between tasks. Recent approaches, such as Mixture-of-Experts (MoE) and asymmetric LoRA, have aimed to mitigate these issues but still struggle with sample scarcity and noise interference due to their fixed structure. In response, we propose CoLA, a more flexible LoRA architecture with an efficient initialization scheme, which introduces three collaborative strategies to enhance performance by better utilizing the quantitative relationships between the matrices A and B. Our experiments demonstrate the effectiveness and robustness of CoLA, outperforming existing PEFT methods, especially in low-sample scenarios. Our data and code are fully publicly available: https://github.com/zyy-2001/CoLA.
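One way to read the "quantitative relationships between matrices A and B" is a LoRA variant in which several task-specific B matrices collaborate over a shared pool of A matrices. The module below is our own minimal sketch under that assumption; CoLA's actual architecture and initialization scheme are in the linked repository.

import torch
import torch.nn as nn

class CollaborativeLoRALinear(nn.Module):
    # Frozen base linear layer plus a low-rank update in which num_b task
    # heads (B matrices) share a pool of num_a down-projections (A matrices).
    def __init__(self, d_in, d_out, rank=8, num_a=2, num_b=4):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        for p in self.base.parameters():
            p.requires_grad_(False)              # pre-trained weights stay frozen
        self.A = nn.ParameterList(
            nn.Parameter(torch.randn(rank, d_in) * 0.01) for _ in range(num_a))
        self.B = nn.ParameterList(
            nn.Parameter(torch.zeros(d_out, rank)) for _ in range(num_b))

    def forward(self, x, task_id=0):
        # Pool the shared A matrices, then apply the task-specific B;
        # B is zero-initialized, so the update starts as a no-op (as in LoRA).
        a = torch.stack(list(self.A)).mean(dim=0)    # (rank, d_in)
        return self.base(x) + x @ a.T @ self.B[task_id].T

layer = CollaborativeLoRALinear(16, 16)
print(layer(torch.randn(3, 16), task_id=1).shape)   # torch.Size([3, 16])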
Towards Reverse Engineering of Language Models: A Survey
Xinpeng Ti | Wentao Ye | Zhifang Zhang | Junbo Zhao | Chang Yao | Lei Feng | Haobo Wang
Findings of the Association for Computational Linguistics: EMNLP 2025
With the continuous development of language models and the widespread availability of various types of accessible interfaces, large language models (LLMs) have been applied to an increasing number of fields. However, given the vast amounts of data and computational resources required for model development, protecting a model’s parameters and training data has become an urgent and crucial concern. Due to the revolutionary training and application paradigms of LLMs, many new attacks on language models have emerged in recent years. In this paper, we define these attacks as “reverse engineering” (RE) techniques on LMs and provide an in-depth analysis of them. We illustrate the various reverse engineering methods applied to different aspects of a model, and also introduce existing protective strategies. On the one hand, this survey demonstrates that even black-box models are vulnerable to different types of attacks; on the other hand, it offers a more holistic perspective for the development of new protective strategies for models.
Cognitive-Level Adaptive Generation via Capability-Aware Retrieval and Style Adaptation
Qingsong Wang | Tao Wu | Wang Lin | Yueying Feng | Gongsheng Yuan | Chang Yao | Jingyuan Chen
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Language Models (LLMs) have demonstrated strong performance in open-ended generation tasks. However, they often struggle to adapt content to users with differing cognitive capacities, leading to a phenomenon we term cognitive misalignment. This issue arises in two forms: knowledge-level misalignment, where content is too complex or too simplistic relative to user understanding, and presentation style misalignment, where the structure or tone hinders effective comprehension. To address these challenges, we propose the Cognitive-Level Alignment Framework (CLAF), a general-purpose generation framework that aligns both knowledge complexity and presentation style with user cognition. CLAF integrates a capability-aware retrieval module based on a hierarchical knowledge graph and a style optimization module guided by Bloom’s taxonomy and preference learning. Additionally, a knowledge-controllable generation component ensures consistency and relevance throughout the output. To support training and evaluation, we construct Scale, a cognitively annotated dataset containing responses at multiple comprehension levels per query. Empirical results show that CLAF enhances the adaptability and informativeness of LLM outputs across a range of user profiles, offering a robust solution to cognitive-level alignment in real-world applications.
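A capability-aware retrieval step in the spirit of the abstract could filter candidate knowledge-graph nodes by the user's comprehension level before ranking them by relevance. The node schema, level semantics, and scores below are illustrative assumptions only, not CLAF's actual module.

from dataclasses import dataclass

@dataclass
class KGNode:
    text: str
    level: int        # depth in the hierarchical knowledge graph (1 = basic)
    relevance: float  # precomputed query relevance, e.g. embedding similarity

def capability_aware_retrieve(nodes, user_level, k=3):
    # Keep nodes at or below the user's level, then rank by relevance.
    eligible = [n for n in nodes if n.level <= user_level]
    return sorted(eligible, key=lambda n: n.relevance, reverse=True)[:k]

nodes = [
    KGNode("gravity pulls objects toward the ground", level=1, relevance=0.71),
    KGNode("g is about 9.8 m/s^2 near Earth's surface", level=2, relevance=0.84),
    KGNode("free fall follows geodesics in curved spacetime", level=4, relevance=0.90),
]
for n in capability_aware_retrieve(nodes, user_level=2):
    print(n.text)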
2024
MPCoder: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning
Zhenlong Dai | Chang Yao | WenKang Han | Yuanying Yuanying | Zhipeng Gao | Jingyuan Chen
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) have demonstrated great potential for assisting developers in their daily development. However, most research focuses on generating correct code; how to use LLMs to generate personalized code has seldom been investigated. To bridge this gap, we propose MPCoder (Multi-user Personalized Code Generator) to generate personalized code for multiple users. To better learn coding style features, we utilize explicit coding style residual learning to capture the syntactic code style standards and implicit style learning to capture the semantic code style conventions. We train a multi-user style adapter to better differentiate the implicit feature representations of different users through contrastive learning, ultimately enabling personalized code generation for multiple users. We further propose a novel evaluation metric for estimating the similarity between codes of different coding styles. The experimental results show the effectiveness of our approach for this novel task.
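The contrastive training of the multi-user style adapter can be pictured with a generic InfoNCE-style objective that pulls together style embeddings of snippets from the same user and pushes apart those from different users. The batch layout and temperature below are our assumptions; this is not MPCoder's released code.

import torch
import torch.nn.functional as F

def user_contrastive_loss(style_emb, user_ids, temperature=0.1):
    # style_emb: (batch, dim) style representations from the adapter
    # user_ids:  (batch,) integer user labels; same user => positive pair
    z = F.normalize(style_emb, dim=-1)
    sim = z @ z.T / temperature                    # cosine similarities
    sim.fill_diagonal_(float("-inf"))              # exclude self-pairs
    pos = user_ids.unsqueeze(0) == user_ids.unsqueeze(1)
    pos.fill_diagonal_(False)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # mean log-probability of each anchor's positive pairs
    # (masked_fill zeroes out non-positive entries, avoiding -inf * 0)
    loss = -(log_prob.masked_fill(~pos, 0.0).sum(dim=1)
             / pos.sum(dim=1).clamp(min=1))
    return loss.mean()

emb = torch.randn(6, 32)
ids = torch.tensor([0, 0, 1, 1, 2, 2])
print(user_contrastive_loss(emb, ids).item())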