Abstract
While large language models (LLMs) exhibit impressive linguistic abilities, their numerical reasoning skills within real-world contexts remain under-explored. This paper describes our participation in a headline-generation challenge by NumEval at SemEval 2024, which focused on numerical reasoning. Our system achieved an overall top numerical accuracy of 73.49% on the task. We explore the system’s design choices contributing to this result and analyze common error patterns. Our findings highlight the potential and ongoing challenges of integrating numerical reasoning within large language model-based headline generation.
- Anthology ID:
- 2024.semeval-1.103
- Volume:
- Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 716–720
- URL:
- https://aclanthology.org/2024.semeval-1.103
- Cite (ACL):
- Pawan Rajpoot and Nut Chukamphaeng. 2024. Team NP_PROBLEM at SemEval-2024 Task 7: Numerical Reasoning in Headline Generation with Preference Optimization. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 716–720, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Team NP_PROBLEM at SemEval-2024 Task 7: Numerical Reasoning in Headline Generation with Preference Optimization (Rajpoot & Chukamphaeng, SemEval 2024)
- PDF:
- https://preview.aclanthology.org/fix-volume-bibkeys/2024.semeval-1.103.pdf