AutoTrain: No-code training for state-of-the-art models

Abhishek Thakur

doi:10.18653/v1/2024.emnlp-demo.44

Abstract

With the advancements in open-source models, training (or finetuning) models on custom datasets has become a crucial part of developing solutions tailored to specific industrial or open-source applications. Yet there is no single tool that simplifies the process of training across different types of modalities or tasks. We introduce AutoTrain (aka AutoTrain Advanced), an open-source, no-code tool/library that can be used to train (or finetune) models for different kinds of tasks, such as large language model (LLM) finetuning, text classification/regression, token classification, sequence-to-sequence tasks, finetuning of sentence transformers, visual language model (VLM) finetuning, image classification/regression, and even classification and regression tasks on tabular data. AutoTrain Advanced is an open-source library providing best practices for training models on custom datasets. The library is available at https://github.com/huggingface/autotrain-advanced. AutoTrain can be used in fully local mode or on cloud machines and works with tens of thousands of models shared on the Hugging Face Hub and their variations.

Anthology ID: 2024.emnlp-demo.44
Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Delia Irazu Hernandez Farias, Tom Hope, Manling Li
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 419–423
URL: https://preview.aclanthology.org/ingest_wac_2008/2024.emnlp-demo.44/
DOI: 10.18653/v1/2024.emnlp-demo.44
Cite (ACL): Abhishek Thakur. 2024. AutoTrain: No-code training for state-of-the-art models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 419–423, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): AutoTrain: No-code training for state-of-the-art models (Thakur, EMNLP 2024)
PDF: https://preview.aclanthology.org/ingest_wac_2008/2024.emnlp-demo.44.pdf
