Zhang Zhuocheng

2025

FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation
Zhang Zhuocheng | Yang Feng | Min Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

Retrieval-Augmented Generation (RAG) plays a pivotal role in modern large language model applications, with numerous existing frameworks offering a wide range of functionalities to facilitate the development of RAG systems. However, we have identified several persistent challenges in these frameworks, including lack of new techniques, difficulties in algorithm reproduction and sharing, and high system overhead. To address these limitations, we introduce **FlexRAG**, an open-source framework specifically designed for research and prototyping. FlexRAG supports text-based, multimodal, and network-based RAG, providing comprehensive lifecycle support alongside efficient asynchronous processing and persistent caching capabilities. By offering a robust and flexible solution, FlexRAG enables researchers to rapidly develop, deploy, and share advanced RAG systems. Our toolkit and resources are available at https://github.com/ictnlp/FlexRAG.

2023

Enhancing Neural Machine Translation with Semantic Units
Langlin Huang | Shuhao Gu | Zhang Zhuocheng | Yang Feng
Findings of the Association for Computational Linguistics: EMNLP 2023

Conventional neural machine translation (NMT) models typically use subwords and words as the basic units for model input and comprehension. However, complete words and phrases composed of several tokens are often the fundamental units for expressing semantics, referred to as semantic units. To address this issue, we propose a method, Semantic Units for Machine Translation (SU4MT), which models the integral meanings of semantic units within a sentence and then leverages them to provide a new perspective for understanding the sentence. Specifically, we first propose Word Pair Encoding (WPE), a phrase extraction method to help identify the boundaries of semantic units. Next, we design an Attentive Semantic Fusion (ASF) layer to integrate the semantics of multiple subwords into a single vector: the semantic unit representation. Lastly, the semantic-unit-level sentence representation is concatenated to the token-level one, and they are combined as the input to the encoder. Experimental results demonstrate that our method effectively models and leverages semantic-unit-level information and outperforms the strong baselines.

Scaling Law for Document Neural Machine Translation
Zhang Zhuocheng | Shuhao Gu | Min Zhang | Yang Feng
Findings of the Association for Computational Linguistics: EMNLP 2023

The scaling laws of language models have played a significant role in advancing large language models. In order to promote the development of document translation, we systematically examine the scaling laws in this field. In this paper, we carry out an in-depth analysis of the influence of three factors on translation quality: model scale, data scale, and sequence length. Our findings reveal that increasing sequence length effectively enhances model performance when model size is limited. However, sequence length cannot be infinitely extended; it must be suitably aligned with the model scale and corpus volume. Further research shows that providing adequate context can effectively enhance the translation quality of a document’s initial portion. Nonetheless, exposure bias remains the primary factor hindering further improvement in translation quality for the latter half of the document.

Addressing the Length Bias Challenge in Document-Level Neural Machine Translation
Zhang Zhuocheng | Shuhao Gu | Min Zhang | Yang Feng
Findings of the Association for Computational Linguistics: EMNLP 2023

Document-level neural machine translation (DNMT) has shown promising results by incorporating context information through increased maximum lengths of source and target sentences. However, this approach also introduces a length bias problem, whereby DNMT suffers from significant translation quality degradation when decoding sentences that are much shorter or longer than the maximum sentence length during training. To prevent the model from neglecting shorter sentences, we sample the training data to ensure a more uniform distribution across different sentence lengths while progressively increasing the maximum sentence length during training. Additionally, we introduce a length-normalized attention mechanism to aid the model in focusing on target information, mitigating the issue of attention divergence when processing longer sentences. Furthermore, during the decoding stage of DNMT, we propose a sliding decoding strategy that limits the length of target sentences so that it does not exceed the maximum length encountered during training. The experimental results indicate that our method can achieve state-of-the-art results on several open datasets, and further analysis shows that our method can significantly alleviate the length bias problem.

Venues

Findings: 3
ACL: 1
