2025
Lightweight Contenders: Navigating Semi-Supervised Text Mining through Peer Collaboration and Self Transcendence
Qianren Mao | Weifeng Jiang | Junnan Liu | Chenghua Lin | Qian Li | Xianqing Wen | Jianxin Li | Jinhu Lu
Findings of the Association for Computational Linguistics: NAACL 2025
The semi-supervised learning (SSL) strategy in lightweight models requires reducing annotated samples and facilitating cost-effective inference. However, the constraint on model parameters, imposed by the scarcity of training labels, limits SSL performance. In this paper, we introduce PS-NET, a novel framework tailored for semi-supervised text mining with lightweight models. PS-NET incorporates online distillation to train lightweight student models by imitating the teacher model. It also integrates an ensemble of student peers that collaboratively instruct each other. Additionally, PS-NET implements a constant adversarial perturbation schema to further self-augmentation through progressive generalization. Our PS-NET, equipped with a 2-layer distilled BERT, exhibits notable performance enhancements over the SOTA lightweight SSL frameworks FLiText and Disco in SSL text classification with extremely scarce labelled data.
Are Your LLMs Capable of Stable Reasoning?
Junnan Liu | Hongwei Liu | Linchen Xiao | Ziyi Wang | Kuikun Liu | Songyang Gao | Wenwei Zhang | Songyang Zhang | Kai Chen
Findings of the Association for Computational Linguistics: ACL 2025
The rapid advancement of large language models (LLMs) has shown remarkable progress in complex reasoning tasks. However, a significant disparity exists between benchmark performances and real-world applications. We attribute this gap primarily to current evaluation protocols and metrics, which inadequately capture the full spectrum of LLM capabilities, especially in complex reasoning tasks where both accuracy and consistency are essential. In this paper, we introduce G-Pass@k, a novel evaluation metric that continuously assesses model performance across multiple sampling attempts, quantifying both the model’s performance potential and its stability. Through extensive experiments on various public and newly constructed benchmarks, we employ G-Pass@k in conjunction with state-of-the-art large language models to provide comprehensive insights into their potential capabilities and operational consistency. Our findings reveal a significant opportunity to enhance the realistic reasoning abilities of LLMs, underscoring the necessity for more robust evaluation metrics.
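The abstract does not spell out how G-Pass@k is computed, so the following is only an illustrative sketch of how a metric could score both potential and stability across multiple samples. It assumes the standard unbiased pass@k estimator (n generations per problem, c of them correct) and generalizes it with a hypothetical threshold `tau`: the probability that at least ceil(tau * k) of k generations drawn without replacement are correct. The function names, the `tau` parameter, and this formulation are assumptions made for illustration and are not taken from the listing.

```python
from math import ceil, comb


def hypergeom_tail(n: int, c: int, k: int, min_correct: int) -> float:
    """P(at least `min_correct` of k draws are correct) when drawing k of n
    generations without replacement and c of the n are correct."""
    total = comb(n, k)
    hits = sum(
        comb(c, j) * comb(n - c, k - j)  # comb returns 0 when k - j > n - c
        for j in range(min_correct, min(c, k) + 1)
    )
    return hits / total


def pass_at_k(n: int, c: int, k: int) -> float:
    """Standard pass@k: at least one of the k drawn generations is correct."""
    return hypergeom_tail(n, c, k, 1)


def stable_pass_at_k(n: int, c: int, k: int, tau: float) -> float:
    """Assumed stability-aware variant: at least ceil(tau * k) of the k
    drawn generations must be correct (tau = 1.0 demands all k)."""
    return hypergeom_tail(n, c, k, max(1, ceil(tau * k)))
```

Under this sketch, a small `tau` recovers the usual "can the model solve it at all" view, while `tau = 1.0` only rewards problems the model solves on every sampled attempt; the gap between the two is one way to read instability.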
Few-Shot Natural Language to First-Order Logic Translation via Code Generation
Junnan Liu
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Translation of natural language to first-order logic formulas (NL-FOL) has recently gained significant attention for its critical role in logic-based NLP applications. Some studies attempt to utilize pretrained language models in a sequence-to-sequence manner for the NL-FOL task. However, these methods encounter challenges such as (1) inconsistency between the training and inference phases and (2) a data-intensive and resource-intensive finetuning process. This paper introduces a novel NL-FOL translation method, dubbed Code4Logic, which is based on in-context learning and employs code snippets to bridge the gap between natural language and first-order logic. By converting the translation task into a progressive code generation task, Code4Logic demonstrates strong generalization in a training-free manner and improves the ability of large language models (LLMs) to generate complex first-order logic formulas. Experimental results on NL-FOL and downstream task datasets indicate that Code4Logic surpasses prominent training-free baselines and is comparable to supervised models trained on the full training data.
2023
LATENTLOGIC: Learning Logic Rules in Latent Space over Knowledge Graphs
Junnan Liu | Qianren Mao | Chenghua Lin | Yangqiu Song | Jianxin Li
Findings of the Association for Computational Linguistics: EMNLP 2023
Learning logic rules for knowledge graph reasoning is essential, as such rules provide interpretable explanations for reasoning and can be generalized to different domains. However, existing methods often face challenges such as searching in a vast search space (e.g., enumeration of relational paths or multiplication of high-dimensional matrices) and inefficient optimization (e.g., techniques based on reinforcement learning or the EM algorithm). To address these limitations, this paper proposes a novel framework called LatentLogic to efficiently mine logic rules by controllable generation in the latent space. Specifically, to map the discrete relational paths into the latent space, we leverage a pre-trained VAE and employ a discriminator to establish an energy-based distribution. Additionally, we incorporate a sampler based on ordinary differential equations, enabling the efficient generation of logic rules in our approach. Extensive experiments on benchmark datasets demonstrate the effectiveness and efficiency of our proposed method.
2022
Noise-injected Consistency Training and Entropy-constrained Pseudo Labeling for Semi-supervised Extractive Summarization
Yiming Wang | Qianren Mao | Junnan Liu | Weifeng Jiang | Hongdong Zhu | Jianxin Li
Proceedings of the 29th International Conference on Computational Linguistics
Labeling large amounts of extractive summarization data is often prohibitively expensive due to time, financial, and expertise constraints, which poses great challenges to incorporating summarization systems in practical applications. This limitation can be overcome by semi-supervised approaches, namely consistency training and pseudo-labeling, which make full use of unlabeled data. Research on the two, however, has been conducted independently, and very few works try to connect them. In this paper, we first use the noise-injected consistency training paradigm to regularize model predictions. Subsequently, we propose a novel entropy-constrained pseudo-labeling strategy that obtains high-confidence labels from unlabeled predictions by comparing the entropy of supervised and unsupervised predictions. By combining consistency training and pseudo-labeling, this framework enforces a low-density separation between classes, which decently improves the performance of supervised learning over an insufficiently labeled extractive summarization dataset.
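The abstract says pseudo labels are selected by comparing the entropy of supervised and unsupervised predictions, without giving the exact rule. As a rough sketch only, one way such a comparison could work is to accept an unlabeled prediction when its entropy does not exceed the mean entropy of predictions on labeled data. The threshold rule, the function names, and the list-of-probabilities input format below are all assumptions for illustration, not the paper's stated procedure.

```python
from math import log


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * log(p) for p in probs if p > 0)


def select_pseudo_labels(labeled_probs, unlabeled_probs):
    """Return (index, label) pairs for unlabeled predictions that are at
    least as confident as the average prediction on labeled data.

    Assumed rule: the threshold is the mean entropy over `labeled_probs`.
    """
    threshold = sum(entropy(p) for p in labeled_probs) / len(labeled_probs)
    selected = []
    for idx, probs in enumerate(unlabeled_probs):
        if entropy(probs) <= threshold:
            # Pseudo label is the argmax class of the confident prediction.
            label = max(range(len(probs)), key=probs.__getitem__)
            selected.append((idx, label))
    return selected


if __name__ == "__main__":
    labeled = [[0.9, 0.1], [0.8, 0.2]]
    unlabeled = [[0.95, 0.05], [0.5, 0.5], [0.1, 0.9]]
    print(select_pseudo_labels(labeled, unlabeled))
```

In this toy run the uniform prediction `[0.5, 0.5]` has the highest possible entropy and is filtered out, while the two sharper predictions are kept with their argmax classes as pseudo labels.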