Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca

Pinzhen Chen; Shaoxiong Ji; Nikolay Bogoychev; Andrey Kutuzov; Barry Haddow; Kenneth Heafield

Abstract

Foundational large language models (LLMs) can be instruction-tuned to perform open-domain question answering, facilitating applications like chat assistants. While such efforts are often carried out in a single language, we empirically analyze cost-efficient strategies for multilingual scenarios. Our study employs the Alpaca dataset and machine translations of it to form multilingual data, which is then used to tune LLMs through either low-rank adaptation or full-parameter training. Under a controlled computation budget, comparisons show that multilingual tuning is on par with or better than tuning a model for each language. Furthermore, multilingual tuning with downsampled data can be as powerful and more robust. Our findings serve as a guide for expanding language support through instruction tuning.
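
The idea of multilingual tuning with downsampled data under a fixed budget can be illustrated with a small sketch. It is not the authors' procedure, which the abstract does not specify: the function name `downsample_multilingual`, the equal per-language quota, and the toy data are all assumptions made for illustration. The sketch draws roughly the same number of examples from each language so that the mixed set is no larger than a single monolingual set.

```python
import random


def downsample_multilingual(data_by_lang, budget, seed=0):
    """Build a mixed-language training set of exactly `budget` examples.

    Hypothetical helper: the equal split across languages is an assumption,
    not a detail taken from the paper.
    """
    rng = random.Random(seed)
    langs = sorted(data_by_lang)
    base, extra = divmod(budget, len(langs))
    mixed = []
    for i, lang in enumerate(langs):
        # The first `extra` languages take one more example so the
        # quotas sum to the budget exactly.
        quota = base + (1 if i < extra else 0)
        examples = data_by_lang[lang]
        if quota > len(examples):
            raise ValueError(f"not enough examples for {lang}")
        # Sample without replacement within each language.
        mixed.extend((lang, ex) for ex in rng.sample(examples, quota))
    rng.shuffle(mixed)
    return mixed


# Toy stand-in for an instruction dataset and its machine translations.
toy = {
    "en": [f"en-{i}" for i in range(10)],
    "de": [f"de-{i}" for i in range(10)],
    "fi": [f"fi-{i}" for i in range(10)],
}
mixed = downsample_multilingual(toy, budget=10)
```

With three languages and a budget of 10, the quotas are 4, 3, and 3, so the mixed set matches the size of one monolingual set of 10 examples.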

Anthology ID: 2024.findings-eacl.90
Volume: Findings of the Association for Computational Linguistics: EACL 2024
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1347–1356
URL: https://aclanthology.org/2024.findings-eacl.90
Cite (ACL): Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Andrey Kutuzov, Barry Haddow, and Kenneth Heafield. 2024. Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca. In Findings of the Association for Computational Linguistics: EACL 2024, pages 1347–1356, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca (Chen et al., Findings 2024)
PDF: https://preview.aclanthology.org/improve-issue-templates/2024.findings-eacl.90.pdf
Video: https://preview.aclanthology.org/improve-issue-templates/2024.findings-eacl.90.mp4