Chinese Vision-Language Understanding Evaluation
Jiangkuo Wang, Linwei Zheng, Kehai Chen, Xuefeng Bai, Min Zhang
Abstract
“This paper introduces our systems submitted for the Chinese Vision-Language Understanding Evaluation task at the 23rd Chinese Computational Linguistics Conference. In this competition, we utilized X2-VLM and CCLM models to participate in various subtasks such as image-text retrieval, visual grounding, visual dialogue, and visual question answering. Additionally, we employed other models to assess performance on certain subtasks. We optimized our models and successfully applied them to these different tasks.”
- Anthology ID:
- 2024.ccl-3.41
- Volume:
- Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
- Month:
- July
- Year:
- 2024
- Address:
- Taiyuan, China
- Editors:
- Lin Hongfei, Tan Hongye, Li Bin
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 363–373
- Language:
- English
- URL:
- https://preview.aclanthology.org/corrections-2025-09/2024.ccl-3.41/
- Cite (ACL):
- Jiangkuo Wang, Linwei Zheng, Kehai Chen, Xuefeng Bai, and Min Zhang. 2024. Chinese Vision-Language Understanding Evaluation. In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations), pages 363–373, Taiyuan, China. Chinese Information Processing Society of China.
- Cite (Informal):
- Chinese Vision-Language Understanding Evaluation (Wang et al., CCL 2024)
- PDF:
- https://preview.aclanthology.org/corrections-2025-09/2024.ccl-3.41.pdf