Chinese Vision-Language Understanding Evaluation
Wang Jiangkuo, Zheng Linwei, Chen Kehai, Bai Xuefeng, Zhang Min
Abstract
“This paper introduces our systems submitted for the Chinese Vision-Language Understanding Evaluation task at the 23rd Chinese Computational Linguistics Conference. In this competition, we utilized X2-VLM and CCLM models to participate in various subtasks such as image-text retrieval, visual grounding, visual dialogue, and visual question answering. Additionally, we employed other models to assess performance on certain subtasks. We optimized our models and successfully applied them to these different tasks.”
- Anthology ID:
- 2024.ccl-3.41
- Volume:
- Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
- Month:
- July
- Year:
- 2024
- Address:
- Taiyuan, China
- Editors:
- Hongfei Lin, Hongye Tan, Bin Li
- Venue:
- CCL
- SIG:
- Publisher:
- Chinese Information Processing Society of China
- Note:
- Pages:
- 363–373
- Language:
- English
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.ccl-3.41/
- DOI:
- Cite (ACL):
- Wang Jiangkuo, Zheng Linwei, Chen Kehai, Bai Xuefeng, and Zhang Min. 2024. Chinese Vision-Language Understanding Evaluation. In Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations), pages 363–373, Taiyuan, China. Chinese Information Processing Society of China.
- Cite (Informal):
- Chinese Vision-Language Understanding Evaluation (Wang et al., CCL 2024)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.ccl-3.41.pdf