Xinlin Wang

2026

Rethinking Scale: Deployment Trade-offs of Small Language Models under Agent Paradigms
Xinlin Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)

Despite the impressive capabilities of large language models, their substantial computational costs, latency, and privacy risks hinder their widespread deployment in real-world applications. Small Language Models (SLMs) with fewer than 10 billion parameters present a promising alternative; however, their inherent limitations in knowledge and reasoning curtail their effectiveness. Existing research primarily focuses on enhancing SLMs through scaling laws or fine-tuning strategies while overlooking the potential of using agent paradigms, such as tool use and multi-agent collaboration, to systematically compensate for the inherent weaknesses of small models. To address this gap, this paper presents the first large-scale, comprehensive study of <10B open-source models under three paradigms: (1) the base model, (2) a single agent equipped with tools, and (3) a routing-based multi-agent system with collaborative capabilities.Our results show that structured agent frameworks (combining step-by-step reasoning and tool use) substantially improve effectiveness over direct prompting, with single-agent systems achieving the best balance between performance and cost. In contrast, routing-based multi-agent setups introduce additional coordination overhead with limited gains under small-model constraints.Our findings highlight the importance of agent-centric design for efficient and trustworthy deployment in resource-constrained settings.

2025

pdf bib abs

Can Large language model analyze financial statements well?
Xinlin Wang | Mats Brorsson
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)

Since GPT-3.5’s release, large language models (LLMs) have made significant advancements, including in financial analysis. However, their effectiveness in financial calculations and predictions is still uncertain. This study examines LLMs’ ability to analyze financial reports, focusing on three questions: their accuracy in calculating financial ratios, the use of these metrics in DuPont analysis and the Z-score model for bankruptcy prediction, and their effectiveness in predicting financial indicators with limited knowledge. We used various methods, including zero-shot and few-shot learning, retrieval-augmented generation (RAG), and fine-tuning, in three advanced LLMs and compared their outputs to ground truth and expert predictions to assess their calculation and predictive abilities. The results highlight both the potential and limitations of LLMs in processing numerical data and performing complex financial analyses.

Co-authors

Mats Brorsson 1

Venues

Fix author