Sharmila Upadhyaya
The use of software in acquiring, analyzing, and interpreting research data underscores its role as an essential artifact of scientific inquiry. Understanding and tracing the provenance of software in research supports reproducible and collaborative research. In this paper, we present an overview of the second iteration of the Software Mention Detection (SOMD) shared task, part of the Scholarly Document Processing (SDP) workshop held in conjunction with ACL 2025. We aim to encourage participants to develop optimized approaches to software mention detection and to the extraction of additional attributes and relations on the provided gold-standard benchmark. The shared task has two phases. In Phase I, participants implement a joint framework for NER and RE on the given dataset. Phase II adds an out-of-distribution dataset to evaluate the generalizability of the methods proposed in Phase I. The competition ran for two months (March-April 2025) and attracted 18 participants. Four teams completed the competition and submitted full system descriptions. Participants applied various approaches, including joint and pipeline models, and explored data augmentation with LLM-generated samples. The evaluation was based on a macro-F1 score for both NER and RE, with the average reported as the SOMD-score. The winning teams achieved a SOMD-score of 0.89 in Phase I and 0.63 in Phase II, demonstrating the challenge of generalization.
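The scoring described in the abstract above (macro-F1 for NER and for RE, averaged into a SOMD-score) can be sketched as follows. This is a minimal illustration, not the organizers' evaluation script: the function names `macro_f1` and `somd_score` are hypothetical, and scoring aligned per-item labels is a simplifying assumption, since the official evaluation may score entities and relations at a different granularity.

```python
from collections import defaultdict


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over aligned gold/predicted labels.

    Assumption: gold and pred are equal-length lists of labels, one per item.
    """
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def somd_score(ner_macro_f1, re_macro_f1):
    """Average of the NER and RE macro-F1 scores, as the abstract describes."""
    return (ner_macro_f1 + re_macro_f1) / 2
```

Because macro-F1 weights every label equally, rare entity or relation types affect the score as much as frequent ones.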
We describe the system developed by the DFKI-TalkingRobots Team for the CODI-CRAC 2021 Shared-Task on anaphora resolution in dialogue. Our system consists of three subsystems: (1) the Workspace Coreference System (WCS) incrementally clusters mentions using semantic similarity based on embeddings combined with lexical feature heuristics; (2) the Mention-to-Mention (M2M) coreference resolution system pairs same-entity mentions; (3) the Discourse Deixis Resolution (DDR) system employs a Siamese Network to detect discourse anaphor-antecedent pairs. WCS achieved an F1-score of 55.6% averaged across the evaluation test sets, M2M achieved 57.2%, and DDR achieved 21.5%.
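The general idea of incrementally clustering mentions by embedding similarity can be illustrated with a small sketch. It is not the WCS implementation: the cosine measure, the `threshold` value, the single-link (maximum similarity) rule, and the function names are all assumptions made for illustration, and the lexical feature heuristics mentioned in the abstract are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def incremental_cluster(embeddings, threshold=0.8):
    """Process mentions in order, attaching each to its most similar cluster.

    Assumption: a mention joins a cluster when its highest similarity to any
    member reaches `threshold`; otherwise it starts a new cluster.
    Returns clusters as lists of mention indices.
    """
    clusters = []
    for i, emb in enumerate(embeddings):
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = max(cosine(emb, embeddings[j]) for j in cluster)
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters
```

With toy two-dimensional vectors, `incremental_cluster([[1, 0], [0.9, 0.1], [0, 1]])` groups the first two mentions together and leaves the third in its own cluster.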
We compare our team’s systems to others submitted for the CODI-CRAC 2021 Shared-Task on anaphora resolution in dialogue. We analyse the architectures and performance, report some problematic cases in gold annotations, and suggest possible improvements of the systems, their evaluation, data annotation, and the organization of the shared task.