Evaluating and Enhancing Large Language Models for Novelty Assessment in Scholarly Publications

Ethan Lin, Zhiyuan Peng, Yi Fang


Abstract
Recent studies have evaluated the creativity of large language models (LLMs), of which novelty is an important aspect, primarily from a semantic perspective, using benchmarks from cognitive science. However, assessing novelty in scholarly publications, a critical facet of evaluating LLMs as scientific discovery assistants, remains underexplored, despite its potential to accelerate research cycles and prioritize high-impact contributions in scientific workflows. We introduce SchNovel, a benchmark for evaluating LLMs' ability to assess the novelty of scholarly papers, a task central to streamlining the discovery pipeline. SchNovel consists of 15,000 pairs of papers across six fields sampled from the arXiv dataset, with publication dates spanning 2 to 10 years apart. In each pair, the more recently published paper is assumed to be more novel. Additionally, we propose RAG-Novelty, a retrieval-augmented method that mirrors human peer review by grounding novelty assessment in retrieved context. Extensive experiments provide insights into the capabilities of different LLMs to assess novelty and demonstrate that RAG-Novelty outperforms recent baseline models, highlighting LLMs' promise as tools for automating novelty detection in scientific workflows.
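The paper's implementation details are not reproduced on this page, but the abstract's description of RAG-Novelty suggests a pairwise, retrieval-grounded judgment. Below is a minimal sketch of how such a comparison could be structured; `Paper`, `retrieve_similar`, and `call_llm` are hypothetical placeholders, not the authors' code or any specific library's API.

```python
# Minimal sketch of a RAG-Novelty-style pairwise novelty judgment.
# Assumptions (not from the paper): a retriever over an arXiv-like
# corpus and a generic chat-completion LLM, both stubbed out here.

from dataclasses import dataclass


@dataclass
class Paper:
    title: str
    abstract: str


def retrieve_similar(paper: Paper, k: int = 5) -> list[Paper]:
    """Placeholder retriever: return the k most similar published
    papers. In practice this would query an index over arXiv."""
    return []


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM API call (e.g., a chat completion)."""
    raise NotImplementedError


def build_prompt(a: Paper, b: Paper) -> str:
    """Ground the pairwise judgment in retrieved related work,
    mirroring how a peer reviewer assesses novelty against prior art."""
    ctx_a = "\n".join(p.title for p in retrieve_similar(a))
    ctx_b = "\n".join(p.title for p in retrieve_similar(b))
    return (
        "You are a peer reviewer judging novelty.\n"
        f"Paper A: {a.title}\n{a.abstract}\nRelated work for A:\n{ctx_a}\n\n"
        f"Paper B: {b.title}\n{b.abstract}\nRelated work for B:\n{ctx_b}\n\n"
        "Which paper is more novel relative to its related work? "
        "Answer with exactly 'A' or 'B'."
    )


def more_novel(a: Paper, b: Paper) -> str:
    """Return 'A' or 'B' for a SchNovel-style paper pair."""
    return call_llm(build_prompt(a, b)).strip()
```

Evaluation on a SchNovel-style pair would then check whether `more_novel` picks the more recently published paper, per the benchmark's labeling assumption.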
Anthology ID:
2025.aisd-main.5
Volume:
Proceedings of the 1st Workshop on AI and Scientific Discovery: Directions and Opportunities
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico, USA
Editors:
Peter Jansen, Bhavana Dalvi Mishra, Harsh Trivedi, Bodhisattwa Prasad Majumder, Tom Hope, Tushar Khot, Doug Downey, Eric Horvitz
Venues:
AISD | WS
Publisher:
Association for Computational Linguistics
Pages:
46–57
URL:
https://preview.aclanthology.org/moar-dois/2025.aisd-main.5/
DOI:
10.18653/v1/2025.aisd-main.5
Cite (ACL):
Ethan Lin, Zhiyuan Peng, and Yi Fang. 2025. Evaluating and Enhancing Large Language Models for Novelty Assessment in Scholarly Publications. In Proceedings of the 1st Workshop on AI and Scientific Discovery: Directions and Opportunities, pages 46–57, Albuquerque, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Evaluating and Enhancing Large Language Models for Novelty Assessment in Scholarly Publications (Lin et al., AISD 2025)
PDF:
https://preview.aclanthology.org/moar-dois/2025.aisd-main.5.pdf