RAP: A Metric for Balancing Repetition and Performance in Open-Source Large Language Models
Donghao Huang, Thanh-Son Nguyen, Fiona Liausvia, Zhaoxia Wang
Abstract
Large Language Models (LLMs) have significantly advanced natural language processing, but content repetition in open-source LLMs remains a critical challenge that adversely affects user experience. The repetition penalty parameter (RPP) aims to mitigate this issue by discouraging repeated content during generation, but excessive use of RPP can compromise overall output quality. In this paper, we propose Repetition-Aware Performance (RAP), a novel evaluation metric that quantifies repetition and integrates the repetition penalty into the assessment of model performance, enabling tuning of RPP. We evaluate our approach using twelve open-source LLMs, ranging from 2 billion to 70 billion parameters, tested on question answering and machine translation tasks across three datasets with varying prompting techniques. Experimental results show that RAP effectively tunes RPP, helping to identify a trade-off value that significantly reduces repetition while minimizing performance loss. Upon acceptance, we will release the code and the dataset of generated text, providing a valuable resource for further research on repetition detection and LLM evaluation.
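The repetition penalty parameter discussed in the abstract is exposed directly by common open-source inference stacks. Below is a minimal sketch, assuming the Hugging Face transformers library, of how one might sweep RPP values whose repetition/quality trade-off RAP is designed to score; the model name, prompt, and RPP grid are illustrative placeholders, not the paper's released code or experimental setup.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative assumptions: the model, prompt, and RPP grid are placeholders,
# not the models or datasets evaluated in the paper.
model_name = "google/gemma-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Question: Why does the Moon cause tides on Earth?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")

# Sweep the repetition penalty parameter (RPP): 1.0 disables the penalty,
# larger values suppress repeated tokens more aggressively but can hurt quality.
for rpp in (1.0, 1.05, 1.1, 1.2, 1.3):
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=False,
        repetition_penalty=rpp,
    )
    # Decode only the newly generated tokens, dropping the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    print(f"RPP={rpp}: {tokenizer.decode(new_tokens, skip_special_tokens=True)[:80]}")
```

Each generated output can then be scored for both task performance and repetition; RAP aggregates this trade-off so that a suitable RPP value can be selected (the metric itself is defined in the paper).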
- Anthology ID: 2025.naacl-long.69
- Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
- Month: April
- Year: 2025
- Address: Albuquerque, New Mexico
- Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 1479–1496
- URL: https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.69/
- Cite (ACL): Donghao Huang, Thanh-Son Nguyen, Fiona Liausvia, and Zhaoxia Wang. 2025. RAP: A Metric for Balancing Repetition and Performance in Open-Source Large Language Models. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1479–1496, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal): RAP: A Metric for Balancing Repetition and Performance in Open-Source Large Language Models (Huang et al., NAACL 2025)
- PDF: https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.69.pdf