When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs

Ammar Khairi, Daniel D’souza, Ye Shen, Julia Kreutzer, Sara Hooker


Abstract
Recent advancements in large language models (LLMs) have shifted focus toward scaling inference-time compute, which improves performance without retraining the model. A common approach is to sample multiple outputs in parallel and select one of them as the final output. While existing work has focused on English and specific domains, we study how to robustly scale inference-time compute in a multilingual, multi-task setting that spans open-ended generation, math, and translation tasks, for open models at 8B and 111B scale, across seven languages. Our findings highlight the need for tailored sampling and selection strategies. We propose novel solutions tailored for this multi-faceted inference scenario, demonstrating notable gains across languages and tasks. Our methods achieve an average +6.8 jump in win-rates for 8B models on m-ArenaHard-v2.0 prompts in non-English languages against proprietary models like Gemini. At larger scale, our 111B model shows a +9.0 improvement with just five samples compared to single-sample decoding. These results emphasize the importance of language- and task-aware approaches to democratize inference-time improvements.
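The generic "sample several outputs, then select one" pattern mentioned in the abstract can be sketched as below. This is a minimal best-of-N illustration, not the sampling or selection methods proposed in the paper: `best_of_n`, `make_toy_sampler`, and `toy_score` are hypothetical names, the sampler replays a fixed list instead of calling a model, and the length-based score merely stands in for a real selector such as a reward model or judge.

```python
from typing import Callable, Iterable, List


def best_of_n(
    prompt: str,
    sample: Callable[[str], str],
    score: Callable[[str, str], float],
    n: int = 5,
) -> str:
    """Draw n candidate outputs for a prompt and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # In practice these draws would be independent (and parallel) model calls.
    candidates: List[str] = [sample(prompt) for _ in range(n)]
    # Selection step: keep the candidate the scorer rates highest.
    return max(candidates, key=lambda c: score(prompt, c))


def make_toy_sampler(outputs: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: replays a fixed list of outputs."""
    it = iter(outputs)
    return lambda prompt: next(it)


def toy_score(prompt: str, candidate: str) -> float:
    """Hypothetical stand-in for a reward model: prefers longer candidates."""
    return float(len(candidate))


if __name__ == "__main__":
    sampler = make_toy_sampler(["uno", "dos", "tres", "cuatro", "cinco"])
    print(best_of_n("count", sampler, toy_score, n=5))
```

With five samples, as in the 111B setting the abstract reports, the cost is five generation calls plus five scoring calls per prompt. Swapping in a different `score` function changes the selection strategy without touching the sampling loop.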
Anthology ID:
2025.emnlp-main.1402
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
27547–27571
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1402/
Cite (ACL):
Ammar Khairi, Daniel D’souza, Ye Shen, Julia Kreutzer, and Sara Hooker. 2025. When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 27547–27571, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs (Khairi et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1402.pdf
Checklist:
 2025.emnlp-main.1402.checklist.pdf