@inproceedings{xue-etal-2025-rankllm,
title = "{R}ank{LLM}: A Multi-Criteria Decision-Making Method for {LLM} Performance Evaluation in Sentiment Analysis",
author = "Xue, Huzhi and
Zhao, Butian and
Xie, Haihua and
Sun, Zeyu",
editor = "Sun, Maosong and
Duan, Peiyong and
Liu, Zhiyuan and
Xu, Ruifeng and
Sun, Weiwei",
booktitle = "Proceedings of the 24th {C}hina National Conference on Computational Linguistics ({CCL} 2025)",
month = aug,
year = "2025",
address = "Jinan, China",
publisher = "Chinese Information Processing Society of China",
url = "https://preview.aclanthology.org/ingest-ccl/2025.ccl-1.62/",
pages = "818--830",
    abstract = "Large Language Models (LLMs) have made significant advancements in sentiment analysis, yet their quality and reliability vary widely. Existing LLM evaluation studies are limited in scope, lack a comprehensive framework for integrating diverse capabilities, and fail to quantify the impact of prompt design on performance. To address these gaps, this paper introduces a set of LLM evaluation criteria with detailed explanations and mathematical formulations, aiding users in understanding LLM limitations and selecting the most suitable model for sentiment analysis. Using these criteria, we apply the Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS), a classic decision-making method, to rank the performance of LLMs in sentiment analysis. We evaluate six popular LLMs on three Twitter datasets covering different topics and analyze the impact of prompt design by assessing model-prompt combinations. Additionally, a validation experiment on a publicly available annotated dataset further confirms our ranking results. Finally, our findings offer valuable insights into the evaluation and selection of LLMs for sentiment analysis."
}