Hanna Hubarava
2025
PreClinIE: An Annotated Corpus for Information Extraction in Preclinical Studies
Simona Doneva | Hanna Hubarava | Pia Härvelid | Wolfgang Zürrer | Julia Bugajska | Bernard Hild | David Brüschweiler | Gerold Schneider | Tilia Ellendorff | Benjamin Ineichen
ACL 2025
Animal research, sometimes referred to as preclinical research, plays a vital role in bridging the gap between basic science and clinical applications. However, the rapid increase in publications and the complexity of reported findings make it increasingly difficult for researchers to extract and assess relevant information. While automation through natural language processing (NLP) holds great potential for addressing this challenge, progress is hindered by the absence of high-quality, comprehensive annotated resources specific to preclinical studies. To fill this gap, we introduce PreClinIE, a fully open, manually annotated dataset. The corpus consists of abstracts and methods sections from 725 publications, annotated for study rigor indicators (e.g., random allocation) and other study characteristics (e.g., species). We describe the data collection and annotation process, outlining the challenges of working with preclinical literature. By providing this resource, we aim to accelerate the development of NLP tools that enhance literature mining in preclinical research.
2024
MeHuBe at SwissText 2024 Shared Task 1: Ensembling and QLoRA with Retrieved Citations for Fine-Grained Classification of Sustainable Development Goals
Fernando de Meer Pardo | Hanna Hubarava | Vera Bernhard
Proceedings of the 9th edition of the Swiss Text Analytics Conference
Co-authors
- Vera Bernhard 1
- David Brüschweiler 1
- Julia Bugajska 1
- Simona Doneva 1
- Tilia Ellendorff 1