HARE: an entity and relation centric evaluation framework for histopathology reports

Yunsoo Kim, Michal Wen Sheue Ong, Alex Shavick, Honghan Wu, Adam P. Levine


Abstract
Automated text generation in the medical domain is an active area of research and development; however, evaluating the clinical quality of generated reports remains a challenge, especially where domain-specific metrics are lacking, as in histopathology. We propose HARE (Histopathology Automated Report Evaluation), a novel entity- and relation-centric framework composed of a benchmark dataset, a named entity recognition (NER) model, a relation extraction (RE) model, and a novel metric that prioritizes clinically relevant content by aligning critical histopathology entities and relations between reference and generated reports. To develop the HARE benchmark, we annotated 813 de-identified clinical diagnostic histopathology reports and 652 histopathology reports from The Cancer Genome Atlas (TCGA) with domain-specific entities and relations. We fine-tuned GatorTronS, a domain-adapted language model, to develop HARE-NER and HARE-RE, which achieved the highest overall F1-score (0.915) among the tested models. The proposed HARE metric outperformed traditional metrics, including ROUGE and METEOR, as well as radiology metrics such as RadGraph-XL, with the highest correlation and the best regression fit to expert evaluations (improving on the second-best method, GREEN, a large language model-based radiology report evaluator, by Pearson r = 0.168, Spearman ρ = 0.161, Kendall τ = 0.123, R² = 0.176, and RMSE = 0.018). We release HARE, the datasets, and the models at https://github.com/knowlab/HARE to foster advances in histopathology report generation, providing a robust framework for improving report quality.
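As a rough illustration of what aligning entities and relations between a reference and a generated report can look like, the sketch below scores exact-match overlap with F1 and averages the entity and relation scores. This is not the HARE metric itself: the abstract does not give its formula, so the exact-match rule, the equal weighting, the function names, and the entity and relation labels (Diagnosis, Site, Grade, located_in, has_grade) are all assumptions made for the example.

```python
from typing import Set, Tuple

Entity = Tuple[str, str]         # (text span, entity label); labels are hypothetical
Relation = Tuple[str, str, str]  # (head span, relation label, tail span)


def f1_overlap(reference: set, generated: set) -> float:
    """F1 of exact-match overlap between two sets of items."""
    if not reference and not generated:
        return 1.0  # nothing to find and nothing produced
    matched = len(reference & generated)
    if matched == 0:
        return 0.0
    precision = matched / len(generated)
    recall = matched / len(reference)
    return 2 * precision * recall / (precision + recall)


def entity_relation_score(
    ref_entities: Set[Entity],
    gen_entities: Set[Entity],
    ref_relations: Set[Relation],
    gen_relations: Set[Relation],
    relation_weight: float = 0.5,  # assumed weighting, not taken from the paper
) -> float:
    """Weighted mean of entity F1 and relation F1 (illustrative only)."""
    entity_f1 = f1_overlap(ref_entities, gen_entities)
    relation_f1 = f1_overlap(ref_relations, gen_relations)
    return (1 - relation_weight) * entity_f1 + relation_weight * relation_f1


# Toy example: the generated report gets the tumour grade wrong.
ref_entities = {
    ("invasive ductal carcinoma", "Diagnosis"),
    ("breast", "Site"),
    ("grade 2", "Grade"),
}
gen_entities = {
    ("invasive ductal carcinoma", "Diagnosis"),
    ("breast", "Site"),
    ("grade 3", "Grade"),
}
ref_relations = {
    ("invasive ductal carcinoma", "located_in", "breast"),
    ("invasive ductal carcinoma", "has_grade", "grade 2"),
}
gen_relations = {
    ("invasive ductal carcinoma", "located_in", "breast"),
    ("invasive ductal carcinoma", "has_grade", "grade 3"),
}

score = entity_relation_score(
    ref_entities, gen_entities, ref_relations, gen_relations
)
print(round(score, 3))
```

In this toy case two of three entities and one of two relations match, so the single grading error lowers both components. A surface-overlap metric would barely register a one-token difference like this, which is the motivation the abstract gives for an entity- and relation-centric score.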
Anthology ID:
2025.findings-emnlp.490
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
9218–9233
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.490/
DOI:
10.18653/v1/2025.findings-emnlp.490
Cite (ACL):
Yunsoo Kim, Michal Wen Sheue Ong, Alex Shavick, Honghan Wu, and Adam P. Levine. 2025. HARE: an entity and relation centric evaluation framework for histopathology reports. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 9218–9233, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
HARE: an entity and relation centric evaluation framework for histopathology reports (Kim et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.490.pdf
Checklist:
2025.findings-emnlp.490.checklist.pdf