Baixuan Xu


2023

pdf bib
KnowComp at SemEval-2023 Task 7: Fine-tuning Pre-trained Language Models for Clinical Trial Entailment Identification
Weiqi Wang | Baixuan Xu | Tianqing Fang | Lirong Zhang | Yangqiu Song
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

In this paper, we present our system for the textual entailment identification task as a subtask of the SemEval-2023 Task 7: Multi-evidence Natural Language Inference for Clinical Trial Data.The entailment identification task aims to determine whether a medical statement affirms a valid entailment given a clinical trial premise or forms a contradiction with it.Since the task is inherently a text classification task, we propose a system that performs binary classification given a statement and its associated clinical trial.Our proposed system leverages a human-defined prompt to aggregate the information contained in the statement, section name, and clinical trials.Pre-trained language models are then finetuned on the prompted input sentences to learn to discriminate the inference relation between the statement and clinical trial.To validate our system, we conduct extensive experiments with a wide variety of pre-trained language models.Our best system is built on DeBERTa-v3-large, which achieves an F1 score of 0.764 and secures the fifth rank in the official leaderboard.Further analysis indicates that leveraging our designed prompt is effective, and our model suffers from a low recall.Our code and pre-trained models are available at [https://github.com/HKUST-KnowComp/NLI4CT](https://github.com/HKUST-KnowComp/NLI4CT).

pdf
CAT: A Contextualized Conceptualization and Instantiation Framework for Commonsense Reasoning
Weiqi Wang | Tianqing Fang | Baixuan Xu | Chun Yi Louis Bo | Yangqiu Song | Lei Chen
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Commonsense reasoning, aiming at endowing machines with a human-like ability to make situational presumptions, is extremely challenging to generalize.For someone who barely knows about “meditation,” while is knowledgeable about “singing,” he can still infer that “meditation makes people relaxed” from the existing knowledge that “singing makes people relaxed” by first conceptualizing “singing” as a “relaxing event” and then instantiating that event to “meditation.”This process, known as conceptual induction and deduction, is fundamental to commonsense reasoning while lacking both labeled data and methodologies to enhance commonsense modeling.To fill such a research gap, we propose CAT (Contextualized ConceptuAlization and InsTantiation),a semi-supervised learning framework that integrates event conceptualization and instantiation to conceptualize commonsense knowledge bases at scale.Extensive experiments show that our framework achieves state-of-the-art performances on two conceptualization tasks, and the acquired abstract commonsense knowledge can significantly improve commonsense inference modeling.Our code, data, and fine-tuned models are publicly available at [https://github.com/HKUST-KnowComp/CAT](https://github.com/HKUST-KnowComp/CAT).