2025
Exploring Knowledge Filtering for Retrieval-Augmented Discriminative Tasks
Minjie Qiang | Zhongqing Wang | Xiaoyi Bao | HaoYuan Ma | Shoushan Li | Guodong Zhou
Findings of the Association for Computational Linguistics: ACL 2025
Retrieval-augmented methods have achieved remarkable advancements in alleviating the hallucination of large language models. Nevertheless, the introduction of external knowledge does not always lead to the expected improvement in model performance, as irrelevant or harmful information present in the retrieved knowledge can compromise the prediction process. To address these challenges, we propose a novel framework aimed at improving model performance by incorporating knowledge filtering and prediction fusion mechanisms. In particular, our approach first employs a perplexity-based annotation method to collect training data. Then, we design four distinct strategies to filter out harmful retrieved knowledge. Finally, we integrate the filtered knowledge to generate the final result via batch-wise predictions. We conduct extensive experiments across multiple discriminative task datasets to evaluate the proposed framework. The results demonstrate that our framework can significantly enhance the performance of models on discriminative tasks.
Logic: Long-form Outline Generation via Imitative and Critical Self-refinement
Hengwei Liu | Yongliang Shen | Zhe Zheng | Haoyuan Ma | Xingyu Wu | Yin Zhang | Weiming Lu
Findings of the Association for Computational Linguistics: EMNLP 2025
Long-form outline generation for expository articles requires both comprehensive knowledge coverage and logical coherence, which is essential for creating detailed Wikipedia-like content. However, existing methods face critical limitations: outlines generated in the pre-writing stage often have low knowledge density and lack detail, while retrieval-augmented approaches struggle to maintain logical coherence across retrieved information. Additionally, unlike human writers who can iteratively improve through peer feedback and reference similar topics, current approaches lack effective mechanisms for systematic outline refinement. To address these challenges, we propose Logic, a Long-form Outline Generation system via Imitative and Critical self-refinement that mimics human writers’ refinement process. Logic establishes a coherent planning framework and structured knowledge base, learns from similar topic outlines through imitation, and continuously improves through model-based critique. Experiments on FreshWiki and our dataset WikiOutline show that, compared to the best baseline, Logic’s long-form outlines are more organized (with increases of 22.85% and 21.65% respectively) and more logically coherent (with increases of 16.19% and 12.24% respectively). Human evaluation further validates Logic’s effectiveness in generating comprehensive and well-structured long-form outlines.
DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL
Haoyuan Ma | Yongliang Shen | Hengwei Liu | Wenqi Zhang | Haolei Xu | Qiuying Peng | Jun Wang | Weiming Lu
Findings of the Association for Computational Linguistics: EMNLP 2025
Recent text-to-SQL systems powered by large language models (LLMs) have demonstrated remarkable performance in translating natural language queries into SQL. However, these systems often struggle with complex database structures and domain-specific queries, as they primarily focus on enhancing logical reasoning and SQL syntax while overlooking the critical need for comprehensive database understanding. To address this limitation, we propose DB-Explore, a novel framework that systematically aligns LLMs with database knowledge through automated exploration and instruction synthesis. DB-Explore constructs database graphs to capture complex relational schemas, leverages GPT-4 to systematically mine structural patterns and semantic knowledge, and synthesizes instructions to distill this knowledge for efficient fine-tuning of LLMs. Our framework enables comprehensive database understanding through diverse sampling strategies and automated instruction generation, bridging the gap between database structures and language models. Experiments conducted on the SPIDER and BIRD benchmarks validate the effectiveness of DB-Explore, achieving an execution accuracy of 67.0% on BIRD and 87.8% on SPIDER. Notably, our open-source implementation based on Qwen2.5-Coder-7B achieves state-of-the-art results at minimal computational cost, outperforming several GPT-4-driven Text-to-SQL systems.