Keeping an Eye on Context: Attention Allocation over Input Partitions in Referring Expression Generation

Simeon Schüz, Sina Zarrieß


Abstract
In Referring Expression Generation, model inputs are often composed of different representations, including the visual properties of the intended referent, its relative position and size, and the visual context. Yet, the extent to which this information influences the generation process of black-box neural models is largely unclear. We investigate the relative weighting of target, location, and context information in the attention components of a Transformer-based generation model. Our results show a general target bias, which, however, depends on the content of the generated expressions, pointing to interesting directions for future research.
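To illustrate the kind of analysis the abstract describes, the sketch below shows one way to aggregate cross-attention mass over input partitions (target, location, context) of a Transformer's source sequence. The partition boundaries, array shapes, and function name are illustrative assumptions for this sketch, not the authors' actual implementation.

# Hypothetical sketch: measuring attention allocation over input partitions.
# Partition boundaries and shapes below are illustrative, not from the paper.
import numpy as np

def attention_per_partition(attn, partitions):
    """attn: (num_heads, tgt_len, src_len) cross-attention weights;
    partitions: dict mapping partition name -> (start, end) index range
    over the source sequence. Returns the share of total attention mass
    falling on each partition, aggregated over heads and target positions."""
    total = attn.sum()
    return {name: attn[:, :, start:end].sum() / total
            for name, (start, end) in partitions.items()}

# Example: a source sequence where positions 0-48 encode the target object,
# 49-53 its location/size features, and 54-102 the visual context.
rng = np.random.default_rng(0)
attn = rng.random((8, 12, 103))           # 8 heads, 12 generated tokens, 103 source positions
attn /= attn.sum(axis=-1, keepdims=True)  # normalize rows into attention distributions
shares = attention_per_partition(attn, {
    "target": (0, 49), "location": (49, 54), "context": (54, 103)})
print(shares)  # relative attention mass per partition

Comparing such per-partition shares across decoding steps or expression types is one plausible way to probe whether a generation model over-weights the target representation relative to location and context inputs.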
Anthology ID:
2023.mmnlg-1.3
Volume:
Proceedings of the Workshop on Multimodal, Multilingual Natural Language Generation and Multilingual WebNLG Challenge (MM-NLG 2023)
Month:
September
Year:
2023
Address:
Prague, Czech Republic
Editors:
Albert Gatt, Claire Gardent, Liam Cripwell, Anya Belz, Claudia Borg, Aykut Erdem, Erkut Erdem
Venues:
MMNLG | WS
Publisher:
Association for Computational Linguistics
Pages:
20–27
URL:
https://aclanthology.org/2023.mmnlg-1.3
Cite (ACL):
Simeon Schüz and Sina Zarrieß. 2023. Keeping an Eye on Context: Attention Allocation over Input Partitions in Referring Expression Generation. In Proceedings of the Workshop on Multimodal, Multilingual Natural Language Generation and Multilingual WebNLG Challenge (MM-NLG 2023), pages 20–27, Prague, Czech Republic. Association for Computational Linguistics.
Cite (Informal):
Keeping an Eye on Context: Attention Allocation over Input Partitions in Referring Expression Generation (Schüz & Zarrieß, MMNLG-WS 2023)
PDF:
https://preview.aclanthology.org/landing_page/2023.mmnlg-1.3.pdf