Achieving Stronger Generation via Simple Contrastive Tuning
Zhimeng Wang, Pinzheng Wang, Juntao Li, Yibin Chen, Min Zhang
Abstract
Instruction tuning is widely used to unlock the abilities of Large Language Models (LLMs) in following human instructions, resulting in substantial performance improvements across various downstream tasks. Furthermore, contrastive decoding methods are employed to enhance instruction-tuned models. To further explore the potential of contrastive decoding, we introduce the Contrastive Tuning and Decoding (CTD) framework, which enhances model performance without requiring additional data or significant computational resources. During Contrastive Tuning, we optimize a correction model by targeting discrepancies between the original outputs and labels. During Contrastive Decoding, the correction model adjusts the logits of the SFT model on the same input to ensure better adherence to instructions. With the lightweight CTD framework, we refine the behavior of instruction-tuned models, improving their performance on the challenging SUPNATINST dataset with unfamiliar data distributions across various models and prompt formats.
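To make the decoding-time mechanism described in the abstract concrete, the sketch below shows one plausible way a correction model could adjust the logits of an SFT model at each generation step. This is a minimal illustration assuming Hugging Face `transformers` causal LMs; the additive combination `logits_sft + alpha * logits_corr`, the weight `alpha`, and greedy decoding are simplifying assumptions, not the paper's exact formulation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def contrastive_decode(prompt, sft_name, corr_name, alpha=1.0, max_new_tokens=64):
    """Greedy decoding where a correction model adjusts the SFT model's logits.
    Model names and the adjustment form are illustrative assumptions."""
    tok = AutoTokenizer.from_pretrained(sft_name)
    sft = AutoModelForCausalLM.from_pretrained(sft_name).eval()
    corr = AutoModelForCausalLM.from_pretrained(corr_name).eval()

    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits_sft = sft(ids).logits[:, -1, :]    # SFT model's next-token logits
            logits_corr = corr(ids).logits[:, -1, :]  # correction model on the same input
        # Adjust the SFT logits with the correction model's signal (assumed additive form).
        adjusted = logits_sft + alpha * logits_corr
        next_id = adjusted.argmax(dim=-1, keepdim=True)  # greedy step for simplicity
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tok.eos_token_id:
            break
    return tok.decode(ids[0], skip_special_tokens=True)
```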
- Anthology ID: 2024.findings-emnlp.525
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 8986–8999
- URL: https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.525/
- DOI: 10.18653/v1/2024.findings-emnlp.525
- Cite (ACL): Zhimeng Wang, Pinzheng Wang, Juntao Li, Yibin Chen, and Min Zhang. 2024. Achieving Stronger Generation via Simple Contrastive Tuning. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 8986–8999, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): Achieving Stronger Generation via Simple Contrastive Tuning (Wang et al., Findings 2024)
- PDF: https://preview.aclanthology.org/fix-sig-urls/2024.findings-emnlp.525.pdf