@inproceedings{aguiar-etal-2024-seme,
title = "{SEME} at {S}em{E}val-2024 Task 2: Comparing Masked and Generative Language Models on Natural Language Inference for Clinical Trials",
author = "Aguiar, Mathilde and
Zweigenbaum, Pierre and
Naderi, Nona",
editor = {Ojha, Atul Kr. and
Do{\u{g}}ru{\"o}z, A. Seza and
Tayyar Madabushi, Harish and
Da San Martino, Giovanni and
Rosenthal, Sara and
Ros{\'a}, Aiala},
booktitle = "Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)",
month = jun,
year = "2024",
address = "Mexico City, Mexico",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/fix-sig-urls/2024.semeval-1.143/",
doi = "10.18653/v1/2024.semeval-1.143",
pages = "986--996",
abstract = "This paper describes our submission to Task 2 of SemEval-2024: Safe Biomedical Natural Language Inference for Clinical Trials. The Multi-evidence Natural Language Inference for Clinical Trial Data (NLI4CT) consists of a Textual Entailment (TE) task focused on the evaluation of the consistency and faithfulness of Natural Language Inference (NLI) models applied to Clinical Trial Reports (CTR). We test 2 distinct approaches, one based on finetuning and ensembling Masked Language Models and the other based on prompting Large Language Models using templates, in particular, using Chain-Of-Thought and Contrastive Chain-Of-Thought. Prompting Flan-T5-large in a 2-shot setting leads to our best system that achieves 0.57 F1 score, 0.64 Faithfulness, and 0.56 Consistency."
}
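
The abstract above mentions that the best-performing system prompts Flan-T5-large in a 2-shot setting. Below is a minimal, hypothetical sketch (not the authors' code) of what such 2-shot prompting for clinical-trial NLI could look like with the Hugging Face transformers library; the prompt wording, demonstration pairs, and labels are illustrative placeholders only.

    # Hypothetical sketch: 2-shot prompting of Flan-T5-large for clinical-trial NLI.
    # Prompt text, example reports/statements, and labels are illustrative, not from the paper.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_name = "google/flan-t5-large"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Two in-context demonstrations (2-shot), followed by the query instance.
    prompt = (
        "Decide whether the statement is an Entailment or a Contradiction "
        "given the clinical trial report section.\n\n"
        "Report: The primary outcome was overall survival at 24 months.\n"
        "Statement: The trial measured survival after two years.\n"
        "Answer: Entailment\n\n"
        "Report: Patients over 75 years of age were excluded.\n"
        "Statement: The trial enrolled patients of any age.\n"
        "Answer: Contradiction\n\n"
        "Report: Adverse events were reported in 12% of the treatment arm.\n"
        "Statement: No adverse events occurred in the treatment arm.\n"
        "Answer:"
    )

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=8)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # expected: a label such as "Contradiction"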