Abstract
Phrase grounding (PG) is a multimodal task that grounds language in images. PG systems are evaluated on well-known benchmarks, using Intersection over Union (IoU) as the evaluation metric. This work highlights a disconcerting bias in the evaluation of grounded plural phrases, which arises from representing sets of objects as a union box covering all component bounding boxes, in conjunction with the IoU metric. We detect, analyze and quantify an evaluation bias in the grounding of plural phrases and define a novel metric, c-IoU, based on a union box's component boxes. We experimentally show that our new metric greatly alleviates this bias and recommend using it for fairer evaluation of plural phrases in PG tasks.
- Anthology ID:
- 2021.alvr-1.4
- Volume:
- Proceedings of the Second Workshop on Advances in Language and Vision Research
- Month:
- June
- Year:
- 2021
- Address:
- Online
- Venue:
- ALVR
- Publisher:
- Association for Computational Linguistics
- Pages:
- 22–28
- URL:
- https://aclanthology.org/2021.alvr-1.4
- DOI:
- 10.18653/v1/2021.alvr-1.4
- Cite (ACL):
- Julia Suter, Letitia Parcalabescu, and Anette Frank. 2021. Grounding Plural Phrases: Countering Evaluation Biases by Individuation. In Proceedings of the Second Workshop on Advances in Language and Vision Research, pages 22–28, Online. Association for Computational Linguistics.
- Cite (Informal):
- Grounding Plural Phrases: Countering Evaluation Biases by Individuation (Suter et al., ALVR 2021)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2021.alvr-1.4.pdf
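The union-box bias described in the abstract can be illustrated with a minimal sketch. The helper names (`iou`, `union_box`, `component_iou`) are ours, and `component_iou` is only an illustrative component-wise average, not the paper's exact c-IoU definition: a prediction that simply covers the union box of two distant objects scores a perfect IoU against the union box, even though it mostly covers background between the objects.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    def area(box):
        return (box[2] - box[0]) * (box[3] - box[1])

    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


def union_box(boxes):
    """Smallest single box covering all component boxes."""
    xs1, ys1, xs2, ys2 = zip(*boxes)
    return (min(xs1), min(ys1), max(xs2), max(ys2))


def component_iou(pred, components):
    """Illustrative component-wise score (an assumption, NOT the paper's
    exact c-IoU): average the prediction's IoU over each component box."""
    return sum(iou(pred, c) for c in components) / len(components)


# Two gold objects far apart, e.g. for the plural phrase "two dogs".
gold = [(0, 0, 10, 10), (90, 0, 100, 10)]
ub = union_box(gold)                 # (0, 0, 100, 10)

# A prediction equal to the union box gets IoU = 1.0 against the union
# box, although 80% of its area is background between the two objects.
pred = ub
print(iou(pred, ub))                 # 1.0 — the bias
print(component_iou(pred, gold))     # 0.1 — component-wise score penalizes it
```

A tight prediction around either object alone would also be mis-scored by the union-box IoU, which is the kind of evaluation distortion a component-based metric is meant to counter.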