Feng Wu
Papers on this page may belong to the following people: Feng Wu, Feng Wu
2025
HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference
Ping Gong | Jiawei Yi | Shengnan Wang | Juncheng Zhang | Zewen Jin | Ouxiang Zhou | Ruibo Liu | Guanbin Xu | Youhui Bai | Bowen Ye | Kun Yuan | Tong Yang | Gong Zhang | Renhai Chen | Feng Wu | Cheng Li
Findings of the Association for Computational Linguistics: ACL 2025
Large Language Models (LLMs) have emerged as a pivotal research area, yet the attention module remains a critical bottleneck in LLM inference, even with techniques like KVCache that mitigate redundant computation. While various top-k attention mechanisms have been proposed to accelerate LLM inference by exploiting the inherent sparsity of attention, they often struggle to strike a balance between efficiency and accuracy. In this paper, we introduce HATA (Hash-Aware Top-k Attention), a novel approach that systematically integrates low-overhead learning-to-hash techniques into the top-k attention process. Unlike existing top-k attention methods, which seek an absolute estimate of the qk score, typically at great cost, HATA maps queries and keys into binary hash codes and obtains the relative order of qk scores at low cost, which is sufficient for realizing top-k attention. Extensive experiments demonstrate that HATA achieves up to 7.2× speedup compared to vanilla full attention while maintaining model accuracy. In addition, HATA outperforms state-of-the-art top-k attention methods in both accuracy and efficiency across multiple mainstream LLMs and diverse tasks. HATA is open source at https://github.com/gpzlx1/HATA.
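The idea of ranking keys by hash codes rather than by exact qk scores can be illustrated with a minimal sketch. It is not the HATA implementation: HATA learns its hash functions, whereas this sketch swaps in random sign projections (SimHash-style) as a stand-in, and all names (`hash_code`, `topk_by_hamming`, `sparse_attention`) and sizes are hypothetical choices made for illustration.

```python
import math
import random


def hash_code(vec, planes):
    """Pack the sign bits of the projections of vec onto each plane into an int.

    Assumption: random planes stand in for the learned hash functions.
    """
    code = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def topk_by_hamming(q_code, key_codes, k):
    """Return the indices of the k key codes closest to q_code in Hamming distance."""
    dists = [bin(q_code ^ kc).count("1") for kc in key_codes]
    # Ties are broken by index so the result is deterministic.
    return sorted(range(len(key_codes)), key=lambda i: (dists[i], i))[:k]


def sparse_attention(query, keys, values, selected):
    """Exact softmax attention computed only over the selected keys."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(a * b for a, b in zip(query, keys[i])) for i in selected]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * values[i][d] for w, i in zip(weights, selected)) / total
        for d in range(dim)
    ]


rng = random.Random(0)
dim, n_keys, n_bits, k = 16, 64, 32, 8

planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_keys)]
values = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_keys)]
query = [rng.gauss(0, 1) for _ in range(dim)]

# Key codes would be computed once and kept next to the cached keys.
key_codes = [hash_code(key, planes) for key in keys]

# Only the relative order of similarities is needed to choose the top-k keys.
selected = topk_by_hamming(hash_code(query, planes), key_codes, k)
output = sparse_attention(query, keys, values, selected)
```

The selection step touches only integer codes (XOR and bit counts), and the full-precision keys and values are read only for the `k` selected positions, which is where the savings in such a scheme would come from.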
2024
QDMR-based Planning-and-Solving Prompting for Complex Reasoning Tasks
Jinfeng Huang | Qiaoqiao She | Wenbin Jiang | Hua Wu | Yang Hao | Tong Xu | Feng Wu
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Chain-of-Thought prompting has improved the reasoning capability of large language models (LLMs). However, it is still challenging to guarantee effectiveness and stability for questions requiring complicated reasoning. Recently, Plan-and-Solve prompting has enhanced the reasoning capability for complex questions by first planning the solution steps and then solving them step by step, but it has difficulty representing and executing the problem-solving logic of complex questions. To deal with these challenges, in this work, we propose a novel Plan-and-Solve prompting method based on Question Decomposition Meaning Representation (QDMR). Specifically, this method first allows the LLM to generate a QDMR graph to represent the problem-solving logic, which is a directed acyclic graph composed of sub-questions. Then, the LLM generates a specific solving process based on the QDMR graph. When solving each sub-question, it can locate the preceding sub-questions and their answers according to the QDMR graph, and then use this information to produce the solution. Compared with existing Plan-and-Solve prompting techniques, our method can not only represent the problem-solving logic of complicated questions more accurately with the aid of the QDMR graph, but also deliver the dependency information accurately to different solution steps according to the QDMR graph. In addition, with supervised fine-tuning on the Allen Institute dataset, the decomposing capability of the LLM for complicated questions can be considerably enhanced. Extensive experiments show that our method achieves significant improvements on arithmetic reasoning and commonsense reasoning tasks over classical Chain-of-Thought prompting and Plan-and-Solve prompting techniques, and the improvements are even greater for problems with more reasoning steps.
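The solving stage described above, where each sub-question is answered using the answers of its predecessors in the graph, can be sketched as a traversal of a directed acyclic graph in dependency order. This is only an illustrative sketch, not the paper's prompting pipeline: the function `solve_with_plan`, the toy question, and the hand-written `toy_answer_fn` (which stands in for an LLM call) are all assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def solve_with_plan(sub_questions, dependencies, answer_fn):
    """Answer sub-questions in dependency order.

    sub_questions: dict mapping node id -> sub-question text
    dependencies:  dict mapping node id -> ids of the sub-questions it relies on
    answer_fn:     callable(node_id, question, context) -> answer, where context
                   holds the answers of the preceding sub-questions
    """
    graph = {node: dependencies.get(node, ()) for node in sub_questions}
    answers = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(graph).static_order():
        context = {dep: answers[dep] for dep in graph[node]}
        answers[node] = answer_fn(node, sub_questions[node], context)
    return answers


# Toy decomposition of: "Tom buys 4 pens at 3 dollars each and 2 notebooks
# at 5 dollars each. How much change does he get from 30 dollars?"
sub_questions = {
    "q1": "How much do 4 pens cost at 3 dollars each?",
    "q2": "How much do 2 notebooks cost at 5 dollars each?",
    "q3": "How much change is left from 30 dollars after paying #q1 and #q2?",
}
dependencies = {"q3": ["q1", "q2"]}


def toy_answer_fn(node_id, question, context):
    """Hypothetical stand-in for an LLM call; answers are hard-coded."""
    if node_id == "q1":
        return 4 * 3
    if node_id == "q2":
        return 2 * 5
    return 30 - context["q1"] - context["q2"]


answers = solve_with_plan(sub_questions, dependencies, toy_answer_fn)
print(answers["q3"])
```

In a real setting `answer_fn` would build a prompt from the sub-question and the `context` answers; the traversal guarantees that every answer a step depends on is available before that step runs.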
2021
Deep Cognitive Reasoning Network for Multi-hop Question Answering over Knowledge Graphs
Jianyu Cai | Zhanqiu Zhang | Feng Wu | Jie Wang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
2020
SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis
Hao Tian | Can Gao | Xinyan Xiao | Hao Liu | Bolei He | Hua Wu | Haifeng Wang | Feng Wu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Recently, sentiment analysis has seen remarkable advances with the help of pre-training approaches. However, sentiment knowledge, such as sentiment words and aspect-sentiment pairs, is ignored in the process of pre-training, despite being widely used in traditional sentiment analysis approaches. In this paper, we introduce Sentiment Knowledge Enhanced Pre-training (SKEP) in order to learn a unified sentiment representation for multiple sentiment analysis tasks. With the help of automatically mined knowledge, SKEP conducts sentiment masking and constructs three sentiment knowledge prediction objectives, so as to embed sentiment information at the word, polarity and aspect level into the pre-trained sentiment representation. In particular, the prediction of aspect-sentiment pairs is converted into multi-label classification, aiming to capture the dependency between words in a pair. Experiments on three kinds of sentiment tasks show that SKEP significantly outperforms a strong pre-training baseline and achieves new state-of-the-art results on most of the test datasets. We release our code at https://github.com/baidu/Senta.