Ivan Yamshchikov
2025
Transfer of Structural Knowledge from Synthetic Languages
Mikhail Budnikov, Ivan Yamshchikov
Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025)
This work explores transfer learning from several synthetic languages to English. We investigate the structure of the embeddings in the fine-tuned models, the information they contain, and the capabilities of the fine-tuned models on simple linguistic tasks. We also introduce a new synthetic language that leads to better transfer to English than the languages used in previous research. Finally, we introduce the Tiny-Cloze Benchmark, a new synthetic benchmark for natural language understanding that is more informative for less powerful models. We use the Tiny-Cloze Benchmark to evaluate fine-tuned models in several domains, demonstrating that fine-tuning on a new synthetic language allows for better performance on a variety of tasks.