Automotive Document Labeling Using Large Language Models
Dang Van Thin, Cuong Xuan Chu, Christian Graf, Tobias Kaminski, Trung-Kien Tran
Abstract
Repairing and maintaining car parts are crucial tasks in the automotive industry, requiring a mechanic to have all relevant technical documents available. However, retrieving the right documents from a huge database heavily depends on domain expertise and is time-consuming and error-prone. By labeling available documents according to the components they relate to, concise and accurate information can be retrieved efficiently. However, this is a challenging task as the relevance of a document to a particular component strongly depends on the context and the expertise of the domain specialist. Moreover, component terminology varies widely between different manufacturers. We address these challenges by utilizing Large Language Models (LLMs) to enrich and unify a component database via web mining, extracting relevant keywords, and leveraging hybrid search and LLM-based re-ranking to select the most relevant component for a document. We systematically evaluate our method using various LLMs on an expert-annotated dataset and demonstrate that it outperforms the baselines, which rely solely on LLM prompting.
- Anthology ID:
- 2025.emnlp-industry.112
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou (China)
- Editors:
- Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1588–1595
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.112/
- Cite (ACL):
- Dang Van Thin, Cuong Xuan Chu, Christian Graf, Tobias Kaminski, and Trung-Kien Tran. 2025. Automotive Document Labeling Using Large Language Models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 1588–1595, Suzhou (China). Association for Computational Linguistics.
- Cite (Informal):
- Automotive Document Labeling Using Large Language Models (Van Thin et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.112.pdf