Yuqing Ma


2026

Large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, their effectiveness relies heavily on supervised training with extensive labeled datasets (e.g., question-answer pairs) or unlabeled datasets (e.g., code snippets), which are often expensive and difficult to obtain at scale. To address this limitation, this paper introduces IPC, an unsupervised framework that leverages Internal Probing of LLMs for Code generation without any external corpus, not even unlabeled code snippets. We introduce problem space probing, test understanding probing, solution space probing, and knowledge consolidation and reinforcement to probe the internal knowledge and confidence patterns of LLMs. Further, IPC identifies reliable code candidates through self-consistency mechanisms and representation-based quality estimation, and uses them to train UCoder (a coder trained with unsupervised learning). We validate the proposed approach across multiple code benchmarks, demonstrating that unsupervised methods can achieve performance competitive with supervised approaches while significantly reducing the dependency on labeled data and computational resources. Analytic experiments reveal that internal model states contain rich signals about code quality and correctness, and that properly harnessing these signals enables effective unsupervised learning for code generation tasks, opening new directions for training code LLMs in resource-constrained scenarios.
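
As a minimal illustration of the self-consistency idea mentioned in the abstract above, the sketch below groups candidate programs by their observed behavior on a set of probe inputs and keeps the largest group. It is not the IPC implementation: the function name `select_by_self_consistency`, the use of plain Python callables as candidates, and the agreement ratio are all assumptions made for this example, and the representation-based quality estimation is not modeled.

```python
from collections import defaultdict
from typing import Any, Callable, List, Sequence, Tuple


def select_by_self_consistency(
    candidates: Sequence[Callable[[Any], Any]],
    probe_inputs: Sequence[Any],
) -> Tuple[List[int], float]:
    """Return the indices of the largest behaviorally identical group
    of candidates, plus that group's share of all candidates."""
    clusters = defaultdict(list)
    for idx, fn in enumerate(candidates):
        signature = []
        for x in probe_inputs:
            try:
                # repr() keeps the signature hashable for any return type.
                signature.append(("ok", repr(fn(x))))
            except Exception as exc:
                # A crash is recorded as part of the candidate's behavior.
                signature.append(("err", type(exc).__name__))
        clusters[tuple(signature)].append(idx)
    largest = max(clusters.values(), key=len)
    return largest, len(largest) / len(candidates)


# Hypothetical candidates for "double the input"; the third one is wrong.
candidates = [lambda x: x * 2, lambda x: x + x, lambda x: x ** 2, lambda x: 2 * x]
chosen, agreement = select_by_self_consistency(candidates, probe_inputs=[0, 1, 3])
print(chosen, agreement)
```

In a sketch like this, the agreement ratio could act as a simple confidence filter: candidates from low-agreement problems would be dropped before any training step. Note that a group of candidates that all fail in the same way can also win the vote, so a real pipeline would need an additional check.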

2025

Intervening in the internal activations of large language models (LLMs) provides an effective inference-time alignment approach to mitigate undesirable behaviors, such as generating erroneous or harmful content, thereby ensuring safe and reliable applications of LLMs. However, previous methods neglect the misalignment discrepancy among different tokens, resulting in a deviant alignment direction and inflexible editing strength. To address these issues, we propose a token-aware editing (TAE) approach that fully utilizes token-level alignment information in the activation space, thereby realizing superior post-intervention performance. Specifically, a Mutual Information-guided Graph Aggregation (MIG) module first builds an MI-guided graph to exploit the tokens’ informative interactions for activation enrichment, thus improving alignment probing and facilitating intervention. Subsequently, Misalignment-aware Adaptive Intervention (MAI) perceives the token-level degree of misalignment from both token representation and prediction to guide the adaptive adjustment of editing strength, thereby enhancing final alignment performance. Extensive experiments on three alignment capabilities demonstrate the efficacy of TAE, notably surpassing the baseline by 25.8% on the primary metric of truthfulness with minimal cost.
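
To make the notion of token-level editing strength concrete, the following sketch shifts each token's activation along a fixed direction, scaled by a per-token misalignment score. It is an illustrative simplification rather than TAE itself: the linear probe (`probe_w`, `probe_b`), the sigmoid scoring, and the function name `token_aware_edit` are assumptions for this example, and the MIG graph aggregation step is omitted.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def token_aware_edit(
    hidden: List[List[float]],   # one activation vector per token
    direction: List[float],      # assumed alignment (steering) direction
    probe_w: List[float],        # hypothetical linear probe weights
    probe_b: float,              # hypothetical linear probe bias
    max_strength: float,         # upper bound on the editing strength
) -> List[List[float]]:
    """Shift every token activation along `direction`, with a strength
    proportional to that token's estimated misalignment."""
    edited = []
    for h in hidden:
        # Misalignment score in (0, 1) from the assumed linear probe.
        score = sigmoid(sum(w * x for w, x in zip(probe_w, h)) + probe_b)
        alpha = max_strength * score
        edited.append([x + alpha * d for x, d in zip(h, direction)])
    return edited


hidden = [[0.0, 0.0], [10.0, 0.0]]
edited = token_aware_edit(hidden, direction=[0.0, 1.0],
                          probe_w=[1.0, 0.0], probe_b=0.0, max_strength=2.0)
print(edited)
```

Here the second token receives a higher probe score and is therefore moved further along the direction than the first, in contrast to a uniform edit that applies the same strength to every token.
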
Retrieval-Augmented Generation (RAG) has proven effective in enhancing the factuality of LLMs’ generation, making it a focal point of research. However, previous RAG approaches overlook the lexical diversity of queries, which hinders granular relevance assessment between queries and retrieved documents and results in suboptimal performance. In this paper, we introduce a Lexical Diversity-aware RAG (DRAG) method to address the biases in relevant information retrieval and utilization induced by lexical diversity. Specifically, a Diversity-sensitive Relevance Analyzer is proposed to decouple and assess the relevance of different query components (words, phrases) based on their levels of lexical diversity, ensuring precise and comprehensive document retrieval. Moreover, a Risk-guided Sparse Calibration strategy is introduced to calibrate the generated tokens that are heavily affected by irrelevant content. Through these modules, DRAG is capable of effectively retrieving relevant documents and leveraging their pertinent knowledge to refine the original results and generate meaningful outcomes. Extensive experiments on widely used benchmarks demonstrate the efficacy of our approach, yielding a 10.6% accuracy improvement on HotpotQA.

2023

In this paper, we propose a new knowledge distillation approach called adaptive contrastive knowledge distillation (ACKD) for BERT compression. Unlike existing knowledge distillation methods for BERT, which implicitly learn discriminative student features by mimicking the teacher features, we first introduce a novel contrastive distillation loss (CDL) based on hidden state features in BERT as explicit supervision for learning discriminative student features. We further observe that sentences with similar features may have completely different meanings, which makes them hard to distinguish. Existing methods do not pay sufficient attention to these hard samples with less discriminative features. Therefore, we propose a new strategy called sample adaptive reweighting (SAR) to adaptively pay more attention to these hard samples and strengthen their discrimination abilities. We incorporate our SAR strategy into our CDL to form the adaptive contrastive distillation loss, on which we build our ACKD framework. Comprehensive experiments on multiple natural language processing tasks demonstrate the effectiveness of our ACKD framework.
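
The combination of a contrastive distillation loss with sample reweighting can be sketched as follows. This is a generic InfoNCE-style formulation under stated assumptions, not the paper's exact CDL or SAR: the cosine similarity, the temperature `tau`, the focal-style weight `(1 - p_pos) ** gamma`, and the function name `adaptive_contrastive_loss` are all choices made for this example. Each student feature is pulled toward the teacher feature of the same sentence and pushed away from the teacher features of the other sentences in the batch.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm or 1.0)  # guard against zero vectors


def adaptive_contrastive_loss(
    student: List[List[float]],
    teacher: List[List[float]],
    tau: float = 0.1,     # assumed temperature
    gamma: float = 1.0,   # assumed sharpness of the reweighting
) -> float:
    """InfoNCE-style loss where hard samples (low positive probability)
    receive a larger weight."""
    n = len(student)
    total = 0.0
    for i in range(n):
        logits = [cosine(student[i], teacher[j]) / tau for j in range(n)]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - m) for l in logits]
        p_pos = exps[i] / sum(exps)        # probability of the matching teacher
        weight = (1.0 - p_pos) ** gamma    # hypothetical hardness weight
        total += -weight * math.log(p_pos)
    return total / n


teacher = [[1.0, 0.0], [0.0, 1.0]]
easy = adaptive_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], teacher)
hard = adaptive_contrastive_loss([[0.0, 1.0], [1.0, 0.0]], teacher)
print(easy, hard)
```

With this weighting, samples whose student features already match the correct teacher feature contribute almost nothing, while confusable samples dominate the loss, which is one simple way to realize the "pay more attention to hard samples" idea.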