Saama Technologies at EHRSQL 2024: SQL Generation through Classification Answer Selector by LLM

Mohammed Jabir, Kamal Kanakarajan, Malaikannan Sankarasubbu


Abstract
The EHRSQL task aims to develop a dependable text-to-SQL model for Electronic Health Records (EHR) databases, which are crucial sources of clinical data that store patients’ medical histories in hospitals. Large language models (LLMs) have shown state-of-the-art performance on text-to-SQL tasks across various domains. To this end, we have developed a framework, SQL Generation through Classification Answer Selector by LLM (SCAS), which comprises two modules. The CAS module determines whether a question is answerable, while the SG module generates the SQL query exclusively for answerable questions. Our system ranked 7th on the leaderboard with a Reliability Score of 53.21 on the official test set.
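The two-module flow described in the abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `scas_pipeline`, `toy_classifier`, and `toy_generator`, the keyword rule, and the `patients` table are all hypothetical stand-ins for the CAS and SG modules.

```python
from typing import Callable, Optional


def scas_pipeline(
    question: str,
    is_answerable: Callable[[str], bool],
    generate_sql: Callable[[str], str],
) -> Optional[str]:
    """Return SQL for an answerable question, or None to abstain."""
    # Stage 1 (CAS role): decide whether the question can be answered.
    if not is_answerable(question):
        return None
    # Stage 2 (SG role): generate SQL only for answerable questions.
    return generate_sql(question)


# Hypothetical toy stand-ins; real modules would call a model instead.
def toy_classifier(question: str) -> bool:
    # Assumption: a keyword check replaces the answerability classifier.
    return "patient" in question.lower()


def toy_generator(question: str) -> str:
    # Assumption: a fixed query replaces the SQL generator.
    return "SELECT COUNT(*) FROM patients"


answered = scas_pipeline(
    "How many patients were admitted?", toy_classifier, toy_generator
)
abstained = scas_pipeline(
    "What is the weather today?", toy_classifier, toy_generator
)
```

Passing the two stages as callables keeps the gating logic separate from whatever model backs each stage, so either one can be swapped without touching the control flow.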
Anthology ID:
2024.clinicalnlp-1.63
Volume:
Proceedings of the 6th Clinical Natural Language Processing Workshop
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Tristan Naumann, Asma Ben Abacha, Steven Bethard, Kirk Roberts, Danielle Bitterman
Venues:
ClinicalNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
655–671
URL:
https://aclanthology.org/2024.clinicalnlp-1.63
Cite (ACL):
Mohammed Jabir, Kamal Kanakarajan, and Malaikannan Sankarasubbu. 2024. Saama Technologies at EHRSQL 2024: SQL Generation through Classification Answer Selector by LLM. In Proceedings of the 6th Clinical Natural Language Processing Workshop, pages 655–671, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Saama Technologies at EHRSQL 2024: SQL Generation through Classification Answer Selector by LLM (Jabir et al., ClinicalNLP-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.clinicalnlp-1.63.pdf