Drahomira Herrmannova

2021

A Mixed-Method Design Approach for Empirically Based Selection of Unbiased Data Annotators
Gautam Thakur | Janna Caspersen | Drahomira Herrmannova | Bryan Eaton | Jordan Burdette
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Overview of the 2021 SDP 3C Citation Context Classification Shared Task
Suchetha N. Kunnath | David Pride | Drahomira Herrmannova | Petr Knoth
Proceedings of the Second Workshop on Scholarly Document Processing

This paper provides an overview of the 2021 3C Citation Context Classification shared task. The second edition of the shared task was organised as part of the 2nd Workshop on Scholarly Document Processing (SDP 2021). The task is composed of two subtasks: classifying citations based on their (Subtask A) purpose and (Subtask B) influence. As in the previous year, both subtasks were hosted on Kaggle and used a portion of the new ACT dataset. A total of 22 teams participated in Subtask A, and 19 teams competed in Subtask B. All participating systems were ranked by their macro F-score. The highest scores, 0.26973 and 0.60025, were reported for Subtasks A and B, respectively.
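As a reading aid for the ranking metric mentioned above, the following is a minimal sketch of a macro-averaged F-score: the per-class F1 is computed separately and then averaged with equal weight per class. The function name `macro_f1`, the label strings, and the choice to average over every label seen in either the gold or the predicted list are assumptions made for illustration; the abstract does not describe the task's actual scoring code.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1 over all labels seen in gold or pred (illustrative only)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed

    labels = set(gold) | set(pred)
    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        per_class.append(2 * tp[label] / denom if denom else 0.0)

    # Every class contributes equally, regardless of how frequent it is
    return sum(per_class) / len(per_class)


# Hypothetical labels, not taken from the shared task data
gold = ["a", "a", "b", "b"]
pred = ["a", "b", "b", "b"]
score = macro_f1(gold, pred)
```

Because each class carries the same weight, poor performance on rare classes lowers the macro score more than it would lower plain accuracy.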

With the ever-increasing pace of research and high volume of scholarly communication, scholars face a daunting task. Not only must they keep up with the growing literature in their own and related fields, but they also increasingly need to rebut pseudo-science and disinformation. These needs have motivated an increasing focus on computational methods for enhancing search, summarization, and analysis of scholarly documents. However, the various strands of research on scholarly document processing remain fragmented. To reach out to the broader NLP and AI/ML community, pool distributed efforts in this area, and enable shared access to published research, we held the 2nd Workshop on Scholarly Document Processing (SDP) at NAACL 2021 as a virtual event (https://sdproc.org/2021/). The SDP workshop consisted of a research track, three invited talks, and three shared tasks (LongSumm 2021, SCIVER, and 3C). The program was geared towards the application of NLP, information retrieval, and data mining for scholarly documents, with an emphasis on identifying and providing solutions to open challenges.

2020

Proceedings of the 8th International Workshop on Mining Scientific Publications
Petr Knoth | Christopher Stahl | Bikash Gyawali | David Pride | Suchetha N. Kunnath | Drahomira Herrmannova
Proceedings of the 8th International Workshop on Mining Scientific Publications

2018

Unsupervised Identification of Study Descriptors in Toxicology Research: An Experimental Study
Drahomira Herrmannova | Steven Young | Robert Patton | Christopher Stahl | Nicole Kleinstreuer | Mary Wolfe
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

Identifying and extracting data elements such as study descriptors in publication full texts is a critical yet manual and labor-intensive step required in a number of tasks. In this paper we address the question of identifying data elements in an unsupervised manner. Specifically, given a set of criteria describing specific study parameters, such as species, route of administration, and dosing regimen, we develop an unsupervised approach to identify text segments (sentences) relevant to the criteria. A binary classifier trained to identify publications that meet the criteria performs better when trained on the candidate sentences than when trained on sentences randomly picked from the text, supporting the intuition that our method is able to accurately identify study descriptors.

Analyzing Citation-Distance Networks for Evaluating Publication Impact
Drahomira Herrmannova | Petr Knoth | Robert Patton
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)