Zhong Ming

2026

HSUGA: LLM-Enhanced Recommendation with Hierarchical Semantic Understanding and Group-Aware Alignment
Guorui Li | Dugang Liu | Lei Li | Xing Tang | Zhong Ming
Findings of the Association for Computational Linguistics: ACL 2026

Large language model (LLM)-enhanced sequential recommendation typically aims to improve two core components: the extraction and the utilization of user semantic embeddings. Despite promising results, existing methods have two limitations: 1) In the extraction stage, most methods feed long interaction sequence fragments directly into an LLM for preference summarization. However, excessively long sequences increase inference difficulty, making it hard to infer accurate user embeddings reliably. 2) In the utilization stage, most methods apply the same semantic embedding utilization strategy to all users, neglecting the differences caused by user activity levels, which leads to suboptimal performance. To address these issues, we propose HSUGA, which introduces a simple yet effective plugin for each of the two core components: Hierarchical Semantic Understanding (HSU) and Group-Aware Alignment (GAA). HSU performs staged, two-phase preference mining and models preference evolution through constrained editing operations, thereby improving the reliability of user semantic extraction. GAA adjusts the semantic utilization intensity based on user activity levels, providing weaker alignment for active users and stronger guidance for users with sparse historical data. Extensive experiments on three benchmark datasets demonstrate the effectiveness and compatibility of HSUGA.

Large Language Models (LLMs) often exhibit extreme sensitivity to surface-level prompt variations, where minor lexical perturbations trigger disproportionate performance fluctuations. Moving beyond black-box optimization or coarse-grained templates, we conduct the first analysis of n-gram token-level mechanisms, leveraging a large-scale dataset of 132,000 prompt variants. Our investigation uncovers the Scaling Law of Prompt Performance Stability: higher average performance is inherently associated with lower variance and greater stability. We identify that this robustness is driven by two linguistic pillars: Domain-Specific Terminology, which anchors semantic boundaries, and Explicit Action Directives, which formalize reasoning trajectories. By narrowing the model’s interpretative space, these patterns effectively "lock" the generation process. We operationalize these findings into an automated Prompt-Refining Agent that autonomously restructures queries via domain anchoring and operational constraints. Empirical results show a 40.7% reduction in performance variance for code generation, offering a statistically grounded framework for robust prompt engineering.
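
As a rough illustration of the kind of stability measurement the abstract refers to, the sketch below computes the variance of task scores across prompt variants and the relative reduction after refinement. The score lists and the function name `variance_reduction` are hypothetical, and the use of population variance is an assumption. None of it is taken from the paper's data or code.

```python
from statistics import mean, pvariance


def variance_reduction(before, after):
    """Relative reduction in score variance, as a fraction of the baseline variance."""
    var_before = pvariance(before)
    var_after = pvariance(after)
    if var_before == 0:
        raise ValueError("baseline variance is zero; reduction is undefined")
    return (var_before - var_after) / var_before


# Hypothetical scores for one task under five different prompt wordings.
baseline_scores = [0.50, 0.70, 0.60, 0.80, 0.40]
refined_scores = [0.70, 0.75, 0.80, 0.75, 0.70]

reduction = variance_reduction(baseline_scores, refined_scores)
print(f"mean: {mean(baseline_scores):.2f} -> {mean(refined_scores):.2f}")
print(f"variance reduction: {reduction:.1%}")
```

In this toy data the refined prompts have both a higher mean and a lower variance, which is the pattern the abstract describes as the relationship between average performance and stability.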

2025

Enhancing Text-to-SQL with Question Classification and Multi-Agent Collaboration
Zhihui Shao | Shubin Cai | Rongsheng Lin | Zhong Ming
Findings of the Association for Computational Linguistics: NAACL 2025

Large Language Models (LLMs) have recently demonstrated remarkable performance in Text-to-SQL tasks. However, existing research primarily focuses on prompt optimization and workflow improvements, and few studies examine the questions themselves. In this paper, we propose a Text-to-SQL framework based on question classification and multi-agent collaboration (QCMA-SQL). Specifically, we first employ multiple cross-attention mechanisms to train a schema selector that classifies questions and selects the most suitable database schema. Subsequently, we select the appropriate agents according to each question's difficulty level to generate preliminary SQL queries. Moreover, we implement syntax validation and execution optimization steps to generate the final SQL queries. Experimental results on the Spider dataset show that the QCMA-SQL framework achieves an execution accuracy of 87.4%, outperforming state-of-the-art methods. Through ablation studies, we find that classifying the questions leads to a 2.8% increase in execution accuracy.
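
For readers unfamiliar with the metric, execution accuracy is commonly understood as the share of predicted queries whose results match those of the reference query when both are run against the database. The sketch below is a simplified version using Python's built-in `sqlite3` module. The table, the query pairs, and the helper names are made up for illustration, row order is ignored, and this is not the official Spider evaluation script or the paper's code.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries run and yield the same rows (order ignored)."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # an invalid predicted query counts as a miss
    gold_rows = conn.execute(gold_sql).fetchall()
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) query pairs whose results match."""
    matches = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return matches / len(pairs)


# Hypothetical toy database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer (name, age) VALUES (?, ?)",
    [("Ann", 30), ("Bo", 41), ("Cy", 25)],
)

# Hypothetical (predicted, gold) query pairs.
pairs = [
    # Different SQL, same rows: counts as a match.
    ("SELECT name FROM singer WHERE age > 28", "SELECT name FROM singer WHERE age >= 30"),
    ("SELECT count(*) FROM singer", "SELECT count(*) FROM singer WHERE age < 100"),
    # Misspelled column: the predicted query fails to execute.
    ("SELECT nam FROM singer", "SELECT name FROM singer"),
    # Youngest versus oldest singer: results differ.
    ("SELECT name FROM singer ORDER BY age LIMIT 1", "SELECT name FROM singer ORDER BY age DESC LIMIT 1"),
]

accuracy = execution_accuracy(conn, pairs)
print(f"execution accuracy: {accuracy:.2f}")
```

Comparing results rather than query strings is what lets two differently written but equivalent queries both count as correct.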

2022

Weighted self Distillation for Chinese word segmentation
Rian He | Shubin Cai | Zhong Ming | Jialei Zhang
Findings of the Association for Computational Linguistics: ACL 2022

Recent research shows that multi-criteria resources and n-gram features are beneficial to Chinese Word Segmentation (CWS). However, these methods rely heavily on this additional information and focus less on the model itself. We thus propose a novel neural framework, named Weighted self Distillation for Chinese word segmentation (WeiDC). The framework, which requires only unigram features, adopts self-distillation with four hand-crafted weight modules and two teacher model configurations. Experimental results show that WeiDC can make use of character features to learn contextual knowledge and achieves state-of-the-art or competitive performance under strictly closed test settings on the SIGHAN Bakeoff benchmark datasets. Further experiments and analyses also demonstrate the robustness of WeiDC. The source code for this paper is available on GitHub.

Augmenting Legal Judgment Prediction with Contrastive Case Relations
Dugang Liu | Weihao Du | Lei Li | Weike Pan | Zhong Ming
Proceedings of the 29th International Conference on Computational Linguistics

Existing legal judgment prediction methods usually take only a single case fact description as input, which may not fully utilize information in the data such as case relations and frequency. In this paper, we propose a new perspective that introduces contrastive case relations to construct case triples as input, together with a corresponding judgment prediction framework with case triples modeling (CTM). Our CTM can more effectively utilize beneficial information to refine the encoding and decoding processes through three customized modules: the case triple module, the relational attention module, and the category decoder module. Finally, we conduct extensive experiments on two public datasets to verify the effectiveness of our CTM, including overall evaluation, compatibility analysis, ablation studies, analysis of gain sources, and visualization of case representations.