Jingchang Chen


2024

pdf
SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models
Zekun Wang | Jingchang Chen | Wangchunshu Zhou | Haichao Zhu | Jiafeng Liang | Liping Shan | Ming Liu | Dongliang Xu | Qing Yang | Bing Qin
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Despite achieving remarkable performance on various vision-language tasks, Transformer-based Vision-Language Models (VLMs) suffer from redundancy in inputs and parameters, significantly hampering their efficiency in real-world applications. Moreover, the degree of redundancy in token representations and model parameters, such as attention heads, varies significantly for different inputs. In light of the challenges, we propose SmartTrim, an adaptive acceleration framework for VLMs, which adjusts the computational overhead per instance. Specifically, we integrate lightweight modules into the original backbone to identify and prune redundant token representations and attention heads within each layer. Furthermore, we devise a self-distillation strategy to enhance the consistency between the predictions of the pruned model and its fully-capacity counterpart. Experimental results across various vision-language tasks consistently demonstrate that SmartTrim accelerates the original model by 2-3 times with minimal performance degradation, highlighting the effectiveness and efficiency compared to previous approaches. Code will be available at https://github.com/kugwzk/SmartTrim.

pdf
An Information Bottleneck Perspective for Effective Noise Filtering on Retrieval-Augmented Generation
Kun Zhu | Xiaocheng Feng | Xiyuan Du | Yuxuan Gu | Weijiang Yu | Haotian Wang | Qianglong Chen | Zheng Chu | Jingchang Chen | Bing Qin
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Retrieval-augmented generation integrates the capabilities of large language models with relevant information retrieved from an extensive corpus, yet encounters challenges when confronted with real-world noisy data. One recent solution is to train a filter module to find relevant content but only achieve suboptimal noise compression. In this paper, we propose to introduce the information bottleneck theory into retrieval-augmented generation. Our approach involves the filtration of noise by simultaneously maximizing the mutual information between compression and ground output, while minimizing the mutual information between compression and retrieved passage. In addition, we derive the formula of information bottleneck to facilitate its application in novel comprehensive evaluations, the selection of supervised fine-tuning data, and the construction of reinforcement learning rewards. Experimental results demonstrate that our approach achieves significant improvements across various question answering datasets, not only in terms of the correctness of answer generation but also in the conciseness with 2.5% compression rate.

pdf
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Zheng Chu | Jingchang Chen | Qianglong Chen | Weijiang Yu | Tao He | Haotian Wang | Weihua Peng | Ming Liu | Bing Qin | Ting Liu
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Reasoning, a fundamental cognitive process integral to human intelligence, has garnered substantial interest within artificial intelligence.Notably, recent studies have revealed that chain-of-thought prompting significantly enhances LLM’s reasoning capabilities, which attracts widespread attention from both academics and industry.In this paper, we systematically investigate relevant research, summarizing advanced methods through a meticulous taxonomy that offers novel perspectives.Moreover, we delve into the current frontiers and delineate the challenges and future directions, thereby shedding light on future research.Furthermore, we engage in a discussion about open questions.We hope this paper serves as an introduction for beginners and fosters future research.Resources have been made publicly available at https://github.com/zchuz/CoT-Reasoning-Survey

pdf
TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models
Zheng Chu | Jingchang Chen | Qianglong Chen | Weijiang Yu | Haotian Wang | Ming Liu | Bing Qin
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Grasping the concept of time is a fundamental facet of human cognition, indispensable for truly comprehending the intricacies of the world.Previous studies typically focus on specific aspects of time, lacking a comprehensive temporal reasoning benchmark.To address this, we propose TimeBench, a comprehensive hierarchical temporal reasoning benchmark that covers a broad spectrum of temporal reasoning phenomena.TimeBench provides a thorough evaluation for investigating the temporal reasoning capabilities of large language models.We conduct extensive experiments on GPT-4, LLaMA2, and other popular LLMs under various settings.Our experimental results indicate a significant performance gap between the state-of-the-art LLMs and humans, highlighting that there is still a considerable distance to cover in temporal reasoning.Besides, LLMs exhibit capability discrepancies across different reasoning categories.Furthermore, we thoroughly analyze the impact of multiple aspects on temporal reasoning and emphasize the associated challenges.We aspire for TimeBench to serve as a comprehensive benchmark, fostering research in temporal reasoning.Code and data are available at https://github.com/zchuz/TimeBench.

pdf
BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering
Zheng Chu | Jingchang Chen | Qianglong Chen | Haotian Wang | Kun Zhu | Xiyuan Du | Weijiang Yu | Ming Liu | Bing Qin
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large language models (LLMs) have demonstrated strong reasoning capabilities.Nevertheless, they still suffer from factual errors when tackling knowledge-intensive tasks.Retrieval-augmented reasoning represents a promising approach.However, significant challenges still persist, including inaccurate and insufficient retrieval for complex questions, as well as difficulty in integrating multi-source knowledge.To address this, we propose Beam Aggregation Reasoning (BeamAggR), a reasoning framework for knowledge-intensive multi-hop QA.BeamAggR explores and prioritizes promising answers at each hop of question.Concretely, we parse the complex questions into trees, which include atom and composite questions, followed by bottom-up reasoning.For atomic questions, the LLM conducts reasoning on multi-source knowledge to get answer candidates.For composite questions, the LLM combines beam candidates, explores multiple reasoning paths through probabilistic aggregation, and prioritizes the most promising trajectory.Extensive experiments on four open-domain multi-hop reasoning datasets show that our method significantly outperforms SOTA methods by 8.5%.Furthermore, our analysis reveals that BeamAggR elicits better knowledge collaboration and answer aggregation.