Huiqun Yu
2025
VenusFactory: An Integrated System for Protein Engineering with Data Retrieval and Language Model Fine-Tuning
Yang Tan | Chen Liu | Jingyuan Gao | Wu Banghao | Mingchen Li | Ruilin Wang | Lingrong Zhang | Huiqun Yu | Guisheng Fan | Liang Hong | Bingxin Zhou
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Natural language processing (NLP) has significantly influenced scientific domains beyond human language, including protein engineering, where pre-trained protein language models (PLMs) have demonstrated remarkable success. However, interdisciplinary adoption remains limited by challenges in data collection, task benchmarking, and application. This work presents VenusFactory, a versatile engine that integrates biological data retrieval, standardized task benchmarking, and modular fine-tuning of PLMs. VenusFactory serves both the computer science and biology communities by offering command-line execution as well as a Gradio-based no-code interface, and it integrates 40+ protein-related datasets and 40+ popular PLMs. All implementations are open-sourced at https://github.com/ai4protein/VenusFactory. A video introduction is available at https://www.youtube.com/watch?v=MT6lPH5kgCc.