Chinese Vision-Language Understanding Evaluation

Jiangkuo Wang, Linwei Zheng, Kehai Chen, Xuefeng Bai, Min Zhang


Abstract
This paper introduces our systems submitted for the Chinese Vision-Language Understanding Evaluation task at the 23rd Chinese Computational Linguistics Conference. In this competition, we used the X2-VLM and CCLM models to participate in several subtasks, including image-text retrieval, visual grounding, visual dialogue, and visual question answering. Additionally, we employed other models to assess performance on certain subtasks. We optimized our models and successfully applied them to these different tasks.
Anthology ID:
2024.ccl-3.41
Volume:
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
Month:
July
Year:
2024
Address:
Taiyuan, China
Editors:
Lin Hongfei, Tan Hongye, Li Bin
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
363–373
Language:
English
URL:
https://preview.aclanthology.org/iwcs-25-ingestion/2024.ccl-3.41/
Cite (ACL):
Jiangkuo Wang, Linwei Zheng, Kehai Chen, Xuefeng Bai, and Min Zhang. 2024. Chinese Vision-Language Understanding Evaluation. In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations), pages 363–373, Taiyuan, China. Chinese Information Processing Society of China.
Cite (Informal):
Chinese Vision-Language Understanding Evaluation (Wang et al., CCL 2024)
PDF:
https://preview.aclanthology.org/iwcs-25-ingestion/2024.ccl-3.41.pdf