Image Description Dataset for Language Learners

Kento Tanaka, Taichi Nishimura, Hiroaki Nanjo, Keisuke Shirai, Hirotaka Kameko, Masatake Dantsuji


Abstract
We focus on image description and a corresponding assessment system for language learners. To achieve automatic assessment of image description, we construct a novel dataset, the Language Learner Image Description (LLID) dataset, which consists of images, their descriptions, and assessment annotations. Then, we propose a novel task of automatic error correction for image description, and we develop a baseline model that encodes multimodal information from a learner sentence with an image and accurately decodes a corrected sentence. Our experimental results show that the developed model can revise errors that cannot be revised without an image.
Anthology ID:
2022.lrec-1.735
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
6814–6821
URL:
https://aclanthology.org/2022.lrec-1.735
Cite (ACL):
Kento Tanaka, Taichi Nishimura, Hiroaki Nanjo, Keisuke Shirai, Hirotaka Kameko, and Masatake Dantsuji. 2022. Image Description Dataset for Language Learners. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6814–6821, Marseille, France. European Language Resources Association.
Cite (Informal):
Image Description Dataset for Language Learners (Tanaka et al., LREC 2022)
PDF:
https://preview.aclanthology.org/author-url/2022.lrec-1.735.pdf
Data
COCO