Wenmin Deng


2025

LLMSR@XLLM25: A Language Model-Based Pipeline for Structured Reasoning Data Construction
Hongrui Xing | Xinzhang Liu | Zhuo Jiang | Zhihao Yang | Yitong Yao | Zihan Wang | Wenmin Deng | Chao Wang | Shuangyong Song | Wang Yang | Zhongjiang He | Yongxiang Li
Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025)

In this paper, we present a novel pipeline for the XLLM Shared Task-III: Large Language Model for Structural Reasoning (LLM-SR). Our pipeline addresses key challenges in automatic process-reward training data construction, such as high manual annotation costs, the limited accuracy of large models in structured data processing, and the dependency on auxiliary information for validation. To overcome these limitations, we first decompose the construction process into extraction and validation phases. Leveraging model-generated annotations, we produce pseudo-labeled data and iteratively refine model performance. Second, by analyzing structured data patterns, we encode structural constraints into a rule-based module and fine-tune the model with Group Relative Policy Optimization (GRPO), significantly improving structured data extraction success rates. Finally, we train the model to generate critical responses that assess evidence-conclusion relationships, thereby enhancing validation reliability. Experimental results demonstrate that our pipeline outperforms models with an order of magnitude more parameters and achieves first place in the task.
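To make the reward design concrete, the minimal sketch below shows how a rule-based structural check could serve as the reward signal for GRPO-style fine-tuning of a structured-extraction model. It is an illustration under assumptions, not the authors' code: the JSON field names ("question_parsing", "cot_parsing") and the scoring scheme are hypothetical.

```python
# Minimal sketch (not the paper's implementation): a rule-based structural check
# used as a reward for GRPO-style fine-tuning of structured data extraction.
# The required field names and partial-credit scoring are hypothetical.
import json

REQUIRED_FIELDS = {"question_parsing", "cot_parsing"}  # hypothetical output schema

def structural_reward(model_output: str) -> float:
    """Score a completion by how well it satisfies the structural constraints."""
    try:
        parsed = json.loads(model_output)
    except json.JSONDecodeError:
        return 0.0  # unparseable output gets the lowest reward
    if not isinstance(parsed, dict):
        return 0.0
    present = REQUIRED_FIELDS & parsed.keys()
    return len(present) / len(REQUIRED_FIELDS)  # fraction of required fields found

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantages: standardize each sampled completion's reward within its group."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5 or 1.0
    return [(r - mean) / std for r in rewards]

# Example: score a group of sampled completions for one prompt.
samples = [
    '{"question_parsing": [], "cot_parsing": []}',
    'not valid json',
    '{"question_parsing": []}',
]
rewards = [structural_reward(s) for s in samples]
print(rewards, group_relative_advantages(rewards))
```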

2024

LA-UCL: LLM-Augmented Unsupervised Contrastive Learning Framework for Few-Shot Text Classification
Jing Zhang | Hui Gao | Peng Zhang | Boda Feng | Wenmin Deng | Yuexian Hou
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Few-shot tasks require a model to generalize from only a handful of samples. However, lacking cognitive ability, existing methods cannot fully exploit the limited samples to expand the sample space and still suffer from overfitting. To address these problems, we propose an LLM-Augmented Unsupervised Contrastive Learning framework (LA-UCL), which introduces a cognition-enabled Large Language Model (LLM) for efficient data augmentation and presents corresponding contrastive learning strategies. Specifically, in the self-augmented contrastive learning module, we construct a retrieval-based in-context prompt scheme that retrieves similar but differently categorized data for the original samples, guiding the LLM to generate more discriminative augmented data; we then design a group-level contrastive loss to enhance the model’s discriminative ability. In the external-augmented contrastive learning module, we use web knowledge retrieval to expand the sample space, leverage the LLM to generate more diverse data, and introduce a sample-level contrastive loss on unlabeled data to improve the model’s generalization. Experimental results on six datasets show that our model outperforms the baseline models.
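As a rough illustration of the sample-level contrastive objective, the sketch below computes an InfoNCE-style loss that pulls each unlabeled sample toward its LLM-augmented view while treating the other in-batch samples as negatives. The temperature value and the in-batch negative scheme are assumptions for the sketch, not details confirmed by the paper.

```python
# Minimal sketch (not the paper's implementation): an InfoNCE-style sample-level
# contrastive loss between unlabeled samples and their LLM-augmented views.
import numpy as np

def sample_level_contrastive_loss(z_orig: np.ndarray, z_aug: np.ndarray,
                                  temperature: float = 0.1) -> float:
    """z_orig, z_aug: (batch, dim) L2-normalized embeddings of samples and their augmentations."""
    sim = z_orig @ z_aug.T / temperature                              # pairwise similarities
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))   # log-softmax over candidates
    return float(-np.mean(np.diag(log_prob)))                         # match each sample to its own view

# Example with random unit vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
z /= np.linalg.norm(z, axis=1, keepdims=True)
z_aug = z + 0.05 * rng.normal(size=z.shape)
z_aug /= np.linalg.norm(z_aug, axis=1, keepdims=True)
print(sample_level_contrastive_loss(z, z_aug))
```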