Li Hao
Also published as: 浩 李
2025
Uni-Retrieval: A Multi-Style Retrieval Framework for STEM’s Education
Yanhao Jia | Xinyi Wu | Li Hao | QinglinZhang QinglinZhang | Yuxiao Hu | Shuai Zhao | Wenqi Fan
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
In AI-facilitated teaching, leveraging various query styles to interpret abstract text descriptions is crucial for ensuring high-quality teaching. However, current retrieval models primarily focus on natural text-image retrieval, which leaves them insufficiently tailored to educational scenarios because of ambiguities in the retrieval process. In this paper, we propose a diverse expression retrieval task tailored to educational scenarios, supporting retrieval based on multiple query styles and expressions. We introduce the STEM Education Retrieval Dataset (SER), which contains over 24,000 query pairs of different styles, and Uni-Retrieval, an efficient and style-diversified vision-language retrieval model based on prompt tuning. Uni-Retrieval extracts query style features as prototypes and builds a continuously updated Prompt Bank containing prompt tokens for diverse queries. This bank can be updated during test time to represent domain-specific knowledge for different subject retrieval scenarios. Our framework demonstrates scalability and robustness by dynamically retrieving prompt tokens based on prototype similarity, effectively facilitating learning for unknown queries. Experimental results indicate that Uni-Retrieval outperforms existing retrieval models in most retrieval tasks.
2024
Adverse drug reaction detection with multi-feature enhancement for social media (面向社交媒体多特征增强的药物不良反应检测)
Li Hao (李浩) | Qiu Yunzhi (邱云志) | Lin Hongfei (林鸿飞)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 1: Main Conference)
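Social media is one of the important channels for adverse drug reaction (ADR) detection. This paper proposes DMFE, a social-media-based ADR detection model, to comprehensively capture patients' feedback on drug use. Compared with traditional text detection, social media data often contain irregular grammar and misspelled words. This paper extracts the Abstract Meaning Representation (AMR) of social media data and uses a graph attention network (GAT) to learn abstract semantic features, improving the model's understanding of semantic information. It also uses a character-level convolutional neural network (charCNN) to capture character features, reducing the impact of misspelled words. In addition, this paper uses prompt learning to incorporate MedDRA adverse drug reaction domain keywords, further strengthening the model's understanding of domain knowledge. In experimental evaluation, DMFE achieves the best F1 scores among the compared baseline models on the CADEC and TwiMed datasets.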
Co-authors
- Wenqi Fan 1
- Lin Hongfei (林鸿飞) 1
- Yuxiao Hu 1
- Yanhao Jia 1
- QinglinZhang QinglinZhang 1