FZI-WIM at SemEval-2024 Task 2: Self-Consistent CoT for Complex NLI in Biomedical Domain

Jin Liu, Steffen Thoma


Abstract
This paper describes the inference system of FZI-WIM at SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. Our system uses the chain-of-thought (CoT) paradigm to tackle this complex reasoning problem and further improves CoT performance with self-consistency. Instead of greedy decoding, we sample multiple reasoning chains with the same prompt and make the final verification with majority voting. The self-consistent CoT system achieves a baseline F1 score of 0.80 (1st), faithfulness score of 0.90 (3rd), and consistency score of 0.73 (12th). We release the code and data publicly.
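The majority-voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the function names `extract_label` and `self_consistent_vote`, the "last label mentioned wins" parsing rule, and the `Unknown` fallback are all assumptions made for illustration. The sketch takes reasoning chains that have already been sampled from a model and returns the most frequent label.

```python
from collections import Counter
from typing import List


def extract_label(chain: str) -> str:
    """Hypothetical parser: return the label mentioned last in a reasoning chain."""
    lowered = chain.lower()
    ent = lowered.rfind("entailment")
    con = lowered.rfind("contradiction")
    if ent == -1 and con == -1:
        return "Unknown"  # assumed fallback when no label is found
    return "Entailment" if ent > con else "Contradiction"


def self_consistent_vote(chains: List[str]) -> str:
    """Majority vote over the labels of several sampled reasoning chains."""
    labels = [extract_label(chain) for chain in chains]
    votes = Counter(label for label in labels if label != "Unknown")
    if not votes:
        return "Unknown"
    # No explicit tie-breaking rule is defined in this sketch.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    sampled = [
        "The trial excludes minors, so the statement holds. Entailment",
        "The dosage differs from the report. Contradiction",
        "Both cohorts match the statement. Entailment",
    ]
    print(self_consistent_vote(sampled))
```

In practice the chains would come from sampling the same prompt several times with a non-zero temperature; the number of samples and the decoding settings are not stated in the abstract and are left out here.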
Anthology ID:
2024.semeval-1.184
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1269–1279
URL:
https://aclanthology.org/2024.semeval-1.184
Cite (ACL):
Jin Liu and Steffen Thoma. 2024. FZI-WIM at SemEval-2024 Task 2: Self-Consistent CoT for Complex NLI in Biomedical Domain. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1269–1279, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
FZI-WIM at SemEval-2024 Task 2: Self-Consistent CoT for Complex NLI in Biomedical Domain (Liu & Thoma, SemEval 2024)
PDF:
https://preview.aclanthology.org/ingestion-checklist/2024.semeval-1.184.pdf
Supplementary material:
 2024.semeval-1.184.SupplementaryMaterial.zip
Supplementary material:
 2024.semeval-1.184.SupplementaryMaterial.txt