Chinese Vision-Language Understanding Evaluation

Wang Jiangkuo; Zheng Linwei; Chen Kehai (陈科海); Bai Xuefeng (白雪峰); Zhang Min (张民)

Abstract

This paper introduces our systems submitted for the Chinese Vision-Language Understanding Evaluation task at the 23rd Chinese Computational Linguistics Conference. In this competition, we utilized X2-VLM and CCLM models to participate in various subtasks such as image-text retrieval, visual grounding, visual dialogue, and visual question answering. Additionally, we employed other models to assess performance on certain subtasks. We optimized our models and successfully applied them to these different tasks.

Anthology ID: 2024.ccl-3.41
Volume: Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
Month: July
Year: 2024
Address: Taiyuan, China
Editors: Hongfei Lin, Hongye Tan, Bin Li
Venue: CCL
Publisher: Chinese Information Processing Society of China
Pages: 363–373
Language: English
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.ccl-3.41/
Cite (ACL): Wang Jiangkuo, Zheng Linwei, Chen Kehai, Bai Xuefeng, and Zhang Min. 2024. Chinese Vision-Language Understanding Evaluation. In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations), pages 363–373, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal): Chinese Vision-Language Understanding Evaluation (Wang et al., CCL 2024)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.ccl-3.41.pdf
