Hejie Cui


2024

pdf
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Ran Xu | Hejie Cui | Yue Yu | Xuan Kan | Wenqi Shi | Yuchen Zhuang | May Dongmei Wang | Wei Jin | Joyce Ho | Carl Yang
Findings of the Association for Computational Linguistics: ACL 2024

Clinical natural language processing faces challenges like complex medical terminology and clinical contexts. Recently, large language models (LLMs) have shown promise in this domain. Yet, their direct deployment can lead to privacy issues and are constrained by resources. To address this challenge, we delve into synthetic clinical text generation with LLMs for clinical NLP tasks. We propose an innovative, resource-efficient approach, ClinGen, which infuses knowledge into the process. Our model involves clinical knowledge extraction and context-informed LLM prompting. Both clinical topics and writing styles are drawn from external domain-specific knowledge graphs and LLMs to guide data generation. Our extensive empirical study across 8 clinical NLP tasks and 18 datasets reveals that ClinGen consistently enhances performance across various tasks by 7.7%-8.7% on average, effectively aligning the distribution of real datasets and enriching the diversity of generated training instances.

2023

pdf
PV2TEA: Patching Visual Modality to Textual-Established Information Extraction
Hejie Cui | Rongmei Lin | Nasser Zalmout | Chenwei Zhang | Jingbo Shang | Carl Yang | Xian Li
Findings of the Association for Computational Linguistics: ACL 2023

Information extraction, e.g., attribute value extraction, has been extensively studied and formulated based only on text. However, many attributes can benefit from image-based extraction, like color, shape, pattern, among others. The visual modality has long been underutilized, mainly due to multimodal annotation difficulty. In this paper, we aim to patch the visual modality to the textual-established attribute in- formation extractor. The cross-modality integration faces several unique challenges: (C1) images and textual descriptions are loosely paired intra-sample and inter-samples; (C2) images usually contain rich backgrounds that can mislead the prediction; (C3) weakly supervised labels from textual-established ex- tractors are biased for multimodal training. We present PV2TEA, an encoder-decoder architecture equipped with three bias reduction schemes: (S1) Augmented label-smoothed contrast to improve the cross-modality alignment for loosely-paired image and text; (S2) Attention-pruning that adaptively distinguishes the visual foreground; (S3) Two-level neighborhood regularization that mitigates the label textual bias via reliability estimation. Empirical results on real-world e-Commerce datasets1 demonstrate up to 11.74% absolute (20.97% relatively) F1 increase over unimodal baselines.