BpHigh at SemEval-2023 Task 7: Can Fine-tuned Cross-encoders Outperform GPT-3.5 in NLI Tasks on Clinical Trial Data?

Bhavish Pahwa, Bhavika Pahwa


Abstract
Many nations and organizations have begun collecting and storing clinical trial records for storage and analytical purposes so that medical and clinical practitioners can refer to them on a centralized database over the internet and stay updated with the current clinical information. The amount of clinical trial records have gone through the roof, making it difficult for many medical and clinical practitioners to stay updated with the latest information. To help and support medical and clinical practitioners, there is a need to build intelligent systems that can update them with the latest information in a byte-sized condensed format and, at the same time, leverage their understanding capabilities to help them make decisions. This paper describes our contribution to SemEval 2023 Task 7: Multi-evidence Natural Language Inference for Clinical Trial Data (NLI4CT). Our results show that there is still a need to build domain-specific models as smaller transformer-based models can be finetuned on that data and outperform foundational large language models like GPT-3.5. We also demonstrate how the performance of GPT-3.5 can be increased using few-shot prompting by leveraging the semantic similarity of the text samples and the few-shot train snippets. We will also release our code and our models on open source hosting platforms, GitHub and HuggingFace.
Anthology ID:
2023.semeval-1.266
Volume:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Note:
Pages:
1936–1944
Language:
URL:
https://aclanthology.org/2023.semeval-1.266
DOI:
10.18653/v1/2023.semeval-1.266
Bibkey:
Cite (ACL):
Bhavish Pahwa and Bhavika Pahwa. 2023. BpHigh at SemEval-2023 Task 7: Can Fine-tuned Cross-encoders Outperform GPT-3.5 in NLI Tasks on Clinical Trial Data?. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 1936–1944, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
BpHigh at SemEval-2023 Task 7: Can Fine-tuned Cross-encoders Outperform GPT-3.5 in NLI Tasks on Clinical Trial Data? (Pahwa & Pahwa, SemEval 2023)
Copy Citation:
PDF:
https://preview.aclanthology.org/naacl24-info/2023.semeval-1.266.pdf
Video:
 https://preview.aclanthology.org/naacl24-info/2023.semeval-1.266.mp4