Synthetic Data Fine-Tuning for Effective Team Formation in Enterprises

Guilherme Drummond Lima, Adriano Veloso


Abstract
We evaluate the effectiveness of synthetic data fine-tuning for semantic search in a real-world enterprise team formation scenario. The goal is to retrieve the best employee for a given task from information about each employee's abilities, experience, and other attributes. We evaluate two synthetic data generation strategies: (1) augmenting real-world data with synthetic labels and (2) generating synthetic employee profiles tailored to specific tasks. To measure the impact of these strategies, we fine-tune a pretrained text embedding model using LoRA and rank aggregation techniques, and compare its performance against current state-of-the-art (SOTA) algorithms on a human-curated dataset. Our experiments indicate that a model trained on a combination of both synthetic data generation strategies outperforms established pretrained models on the team formation task, improving ranking metrics by an average of 30% over the best-performing pretrained model.
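The retrieval setting described in the abstract can be illustrated with a minimal sketch: employees are ranked for a task by the cosine similarity between a task embedding and each profile embedding. This is not the authors' implementation. The vectors are hand-written toy values standing in for the output of a text embedding model, and the names `cosine`, `rank_employees`, and `employee_a` to `employee_c` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_employees(task_vec, profiles):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine(task_vec, vec)) for name, vec in profiles.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings (assumed values, not produced by any real model).
profiles = {
    "employee_a": [0.9, 0.1, 0.0],
    "employee_b": [0.2, 0.8, 0.1],
    "employee_c": [0.5, 0.5, 0.5],
}
task_vec = [1.0, 0.2, 0.0]

ranking = rank_employees(task_vec, profiles)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a setup like the one the abstract describes, fine-tuning would change the embeddings themselves, while a ranking step of this kind would stay the same.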
Anthology ID:
2026.eacl-industry.46
Volume:
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Yevgen Matusevych, Gülşen Eryiğit, Nikolaos Aletras
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
598–609
URL:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.46/
Cite (ACL):
Guilherme Drummond Lima and Adriano Veloso. 2026. Synthetic Data Fine-Tuning for Effective Team Formation in Enterprises. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track), pages 598–609, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Synthetic Data Fine-Tuning for Effective Team Formation in Enterprises (Lima & Veloso, EACL 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.46.pdf