TreeForm: End-to-end Annotation and Evaluation for Form Document Parsing

Ran Zmigrod, Zhiqiang Ma, Armineh Nourbakhsh, Sameena Shah


Abstract
Visually Rich Form Understanding (VRFU) poses a complex research problem due to the documents’ highly structured nature and yet highly variable style and content. Current annotation schemes decompose form understanding and omit key hierarchical structure, making development and evaluation of end-to-end models difficult. In this paper, we propose a novel F1 metric to evaluate form parsers and describe a new content-agnostic, tree-based annotation scheme for VRFU: TreeForm. We provide methods to convert previous annotation schemes into TreeForm structures and evaluate TreeForm predictions using a modified version of the normalized tree-edit distance. We present initial baselines for our end-to-end performance metric and the TreeForm edit distance, averaged over the FUNSD and XFUND datasets, of 61.5 and 26.4 respectively. We hope that TreeForm encourages deeper research in annotating, modeling, and evaluating the complexities of form-like documents.
Anthology ID:
2024.law-1.1
Volume:
Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII)
Month:
March
Year:
2024
Address:
St. Julians, Malta
Editors:
Sophie Henning, Manfred Stede
Venues:
LAW | WS
Publisher:
Association for Computational Linguistics
Pages:
1–11
URL:
https://aclanthology.org/2024.law-1.1
Cite (ACL):
Ran Zmigrod, Zhiqiang Ma, Armineh Nourbakhsh, and Sameena Shah. 2024. TreeForm: End-to-end Annotation and Evaluation for Form Document Parsing. In Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII), pages 1–11, St. Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
TreeForm: End-to-end Annotation and Evaluation for Form Document Parsing (Zmigrod et al., LAW-WS 2024)
PDF:
https://preview.aclanthology.org/naacl24-info/2024.law-1.1.pdf
Video:
https://preview.aclanthology.org/naacl24-info/2024.law-1.1.mp4