Perspective: Leveraging Domain Knowledge for Tabular Machine Learning in the Medical Domain

Arijana Bohr, Thomas Altstidl, Bjoern Eskofier, Emmanuelle Salin


Abstract
There has been limited exploration of how to effectively integrate domain knowledge into machine learning for medical tabular data. Traditional approaches often rely on non-generalizable processes tailored to specific datasets. In contrast, recent advances in deep learning for language and tabular data are leading the way toward more generalizable and scalable methods of domain knowledge inclusion. In this paper, we first explore the need for domain knowledge in medical tabular data, categorize types of medical domain knowledge, and discuss how each can be leveraged in tabular machine learning. We then outline strategies for integrating this knowledge at various stages of the machine learning pipeline. Finally, building on recent advances in tabular deep learning, we propose future research directions to support the integration of domain knowledge.
Anthology ID:
2025.trl-1.11
Volume:
Proceedings of the 4th Table Representation Learning Workshop
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Shuaichen Chang, Madelon Hulsebos, Qian Liu, Wenhu Chen, Huan Sun
Venues:
TRL | WS
Publisher:
Association for Computational Linguistics
Pages:
143–155
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.trl-1.11/
Cite (ACL):
Arijana Bohr, Thomas Altstidl, Bjoern Eskofier, and Emmanuelle Salin. 2025. Perspective: Leveraging Domain Knowledge for Tabular Machine Learning in the Medical Domain. In Proceedings of the 4th Table Representation Learning Workshop, pages 143–155, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Perspective: Leveraging Domain Knowledge for Tabular Machine Learning in the Medical Domain (Bohr et al., TRL 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.trl-1.11.pdf