Harnessing CLIP for Evidence Identification in Scientific Literature: A Multimodal Approach to Context24 Shared Task

Anukriti Kumar; Lucy Lu Wang

Abstract

Knowing whether scientific claims are supported by evidence is fundamental to scholarly communication and evidence-based decision-making. We present our approach to Task 1 of the Context24 Shared Task, Contextualizing Scientific Figures and Tables (SDP@ACL2024), which focuses on identifying multimodal evidence in scientific publications that supports claims. We finetune CLIP, a state-of-the-art model for image-text similarity tasks, to identify and rank the figures and tables in a paper that substantiate a specific claim. Our methods focus on text and image preprocessing techniques and on augmenting the organizer-provided training data with labeled examples from the SciMMIR and MedICaT datasets. Our best-performing model achieved NDCG@5 and NDCG@10 values of 0.26 and 0.30, respectively, on the Context24 test split. Our findings underscore the effectiveness of data augmentation and preprocessing in improving the model's ability to match claims with evidence.
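
For readers unfamiliar with the reported metric, the following is a minimal sketch of how NDCG@k can be computed for one claim's ranked list of candidate figures and tables. It uses the common linear-gain, log2-discount formulation; the shared task's official scorer may differ in details, and the function names and the example relevance labels are hypothetical, not taken from the paper.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    # rank is 0-based, so the discount for the top item is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k):
    """NDCG@k: DCG of the given ranking divided by DCG of the ideal ordering."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        # No relevant candidate exists for this claim; define the score as 0.
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg

# Hypothetical example: six candidates ranked by model score,
# where 1 marks a figure or table that supports the claim.
hypothetical_ranking = [0, 1, 0, 0, 1, 0]
print(round(ndcg_at_k(hypothetical_ranking, 5), 3))
print(round(ndcg_at_k(hypothetical_ranking, 10), 3))
```

A per-claim score like this would typically be averaged over all claims in the test split to obtain a single NDCG@5 or NDCG@10 value.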

Anthology ID: 2024.sdp-1.29
Volume: Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Tirthankar Ghosal, Amanpreet Singh, Anita Waard, Philipp Mayr, Aakanksha Naik, Orion Weller, Yoonjoo Lee, Shannon Shen, Yanxia Qin
Venues: sdp | WS
Publisher: Association for Computational Linguistics
Pages: 307–313
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.sdp-1.29/
Cite (ACL): Anukriti Kumar and Lucy Wang. 2024. Harnessing CLIP for Evidence Identification in Scientific Literature: A Multimodal Approach to Context24 Shared Task. In Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024), pages 307–313, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Harnessing CLIP for Evidence Identification in Scientific Literature: A Multimodal Approach to Context24 Shared Task (Kumar & Wang, sdp 2024)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.sdp-1.29.pdf