TAPASGO: Transfer Learning towards a German-Language Tabular Question Answering Model

Dominik Andreas Kowieski, Michael Hellwig, Thomas Feilhauer


Abstract
Processing tabular data is important across many domains and applications. This study investigates the performance and limitations of fine-tuned models for tabular data analysis, focusing on fine-tuning an English model to obtain a German model. To validate the transfer learning approach, the fine-tuned German model and the original English model are compared on test data from the German training set. As a further point of comparison, a potential shortcut is evaluated: translating the German test data into English and applying the original English model. Results show that the fine-tuned model significantly outperforms the original model, demonstrating the effectiveness of transfer learning even with a limited amount of training data. The English model can also process translated German tabular data effectively, albeit with slightly lower accuracy than the fine-tuned model. The evaluation extends to real-world data extracted from the sustainability reports of a financial institution. The fine-tuned model proves superior in extracting knowledge from these tables, which are unrelated to the training data, indicating its potential applicability in practical scenarios. This paper also releases the first manually annotated dataset for German Table Question Answering and the related annotation tool.
Anthology ID:
2024.lrec-main.1354
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
15579–15584
URL:
https://aclanthology.org/2024.lrec-main.1354
Cite (ACL):
Dominik Andreas Kowieski, Michael Hellwig, and Thomas Feilhauer. 2024. TAPASGO: Transfer Learning towards a German-Language Tabular Question Answering Model. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 15579–15584, Torino, Italia. ELRA and ICCL.
Cite (Informal):
TAPASGO: Transfer Learning towards a German-Language Tabular Question Answering Model (Kowieski et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2024.lrec-main.1354.pdf