Siqi Li
2024
Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach
Siqi Li
|
Danni Liu
|
Jan Niehues
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Direct speech translation (ST) models often struggle with rare words. Incorrect translation of these words can have severe consequences, impacting translation quality and user trust. While rare word translation is inherently challenging for neural models due to sparse learning signals, real-world scenarios often allow access to translations of past recordings on similar topics. To leverage these valuable resources, we propose a retrieval-and-demonstration approach to enhance rare word translation accuracy in direct ST models. First, we adapt existing ST models to incorporate retrieved examples for rare word translation, which allows the model to benefit from prepended examples, similar to in-context learning. We then develop a cross-modal (speech-to-speech, speech-to-text, text-to-text) retriever to locate suitable examples. We demonstrate that standard ST models can be effectively adapted to leverage examples for rare word translation, improving rare word translation accuracy over the baseline by 17.6% with gold examples and 8.5% with retrieved examples. Moreover, our speech-to-speech retrieval approach outperforms other modalities and exhibits higher robustness to unseen speakers. Our code is publicly available.
Structured Optimal Brain Pruning for Large Language Models
Jiateng Wei
|
Quan Lu
|
Ning Jiang
|
Siqi Li
|
Jingyang Xiang
|
Jun Chen
|
Yong Liu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The massive parameters and computational demands hinder the widespread application of Large Language Models (LLMs). Network pruning provides a practical solution to this problem. However, existing pruning works for LLMs mainly focus on unstructured pruning or necessitate post-pruning fine-tuning. The former relies on special hardware to accelerate computation, while the latter may need substantial computational resources. In this paper, we introduce a retraining-free structured pruning method called SoBP (Structured Optimal Brain Pruning). It leverages global first-order information to select pruning structures, then refines them with a local greedy approach, and finally adopts module-wise reconstruction to mitigate information loss. We assess the effectiveness of SoBP across 14 models from 3 LLM families on 8 distinct datasets. Experimental results demonstrate that SoBP outperforms current state-of-the-art methods.
Search
Co-authors
- Danni Liu 1
- Jan Niehues 1
- Jiateng Wei 1
- Jingyang Xiang 1
- Jun Chen 1
- show all...