Predicting The Scholarly Impact of Research Papers Using Retrieval-Augmented LLMs

Tamjid Azad, Ibrahim Al Azher, Sagnik Ray Choudhury, Hamed Alhoori


Abstract
Assessing a research paper’s scholarly impact is an important phase in the scientific research process; however, metrics typically take some time after publication to capture that impact accurately. Our study examines how accurately Large Language Models (LLMs) can predict scholarly impact. We use Retrieval-Augmented Generation (RAG) to measure how much LLM performance improves over zero-shot prompting. Results show that Llama3-8b with RAG achieved the best overall performance, while Gemma-7b benefited the most from RAG, exhibiting the largest reduction in Mean Absolute Error (MAE). Our findings suggest that retrieval-augmented LLMs offer a promising approach for early research evaluation. Our code and dataset for this project are publicly available.
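The abstract compares zero-shot and RAG prompting by their Mean Absolute Error. A minimal sketch of that kind of comparison follows; the scores are invented purely for illustration and are not results from the paper, and the names `mean_absolute_error`, `true_scores`, `zero_shot_preds`, and `rag_preds` are hypothetical rather than taken from the authors' code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical impact scores, made up for this example only.
true_scores = [10.0, 20.0, 30.0, 40.0]
zero_shot_preds = [14.0, 12.0, 36.0, 30.0]  # assumed zero-shot predictions
rag_preds = [12.0, 18.0, 33.0, 37.0]        # assumed RAG predictions

mae_zero_shot = mean_absolute_error(true_scores, zero_shot_preds)
mae_rag = mean_absolute_error(true_scores, rag_preds)

# A positive reduction means the RAG predictions were closer on average.
mae_reduction = mae_zero_shot - mae_rag
print(f"zero-shot MAE={mae_zero_shot}, RAG MAE={mae_rag}, reduction={mae_reduction}")
```

Under this reading, the model that "benefited the most from RAG" would be the one with the largest `mae_reduction`, which need not be the model with the lowest RAG error overall.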
Anthology ID:
2025.sdp-1.11
Volume:
Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Tirthankar Ghosal, Philipp Mayr, Amanpreet Singh, Aakanksha Naik, Georg Rehm, Dayne Freitag, Dan Li, Sonja Schimmler, Anita De Waard
Venues:
sdp | WS
Publisher:
Association for Computational Linguistics
Pages:
124–131
URL:
https://preview.aclanthology.org/landing_page/2025.sdp-1.11/
DOI:
10.18653/v1/2025.sdp-1.11
Cite (ACL):
Tamjid Azad, Ibrahim Al Azher, Sagnik Ray Choudhury, and Hamed Alhoori. 2025. Predicting The Scholarly Impact of Research Papers Using Retrieval-Augmented LLMs. In Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025), pages 124–131, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Predicting The Scholarly Impact of Research Papers Using Retrieval-Augmented LLMs (Azad et al., sdp 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.sdp-1.11.pdf