Proceedings of the Workshop on Multilingual Multimodal Learning
Emanuele Bugliarello, Kai-Wei Cheng, Desmond Elliott, Spandana Gella, Aishwarya Kamath, Liunian Harold Li, Fangyu Liu, Jonas Pfeiffer, Edoardo Maria Ponti, Krishna Srinivasan, Ivan Vulić, Yinfei Yang, Da Yin (Editors)
- Anthology ID: 2022.mml-1
- Month: May
- Year: 2022
- Address: Dublin, Ireland and Online
- Venue: MML
- Publisher: Association for Computational Linguistics
- URL: https://aclanthology.org/2022.mml-1
- PDF: https://preview.aclanthology.org/add_acl24_videos/2022.mml-1.pdf
Language-agnostic Semantic Consistent Text-to-Image Generation
SeongJun Jung, Woo Suk Choi, Seongho Choi, Byoung-Tak Zhang
Recent GAN-based text-to-image generation models have advanced to the point that they can generate photo-realistic images that semantically match their descriptions. However, multilingual text-to-image generation has received little research attention so far. Two problems arise when constructing a multilingual text-to-image generation model: 1) language imbalance in text-to-image paired datasets and 2) the generation of semantically inconsistent images from texts that have the same meaning but are expressed in different languages. To address these problems, we propose a Language-agnostic Semantic Consistent Generative Adversarial Network (LaSC-GAN) for text-to-image generation, which generates semantically consistent images via a language-agnostic text encoder and a Siamese mechanism. Experiments on text-image datasets in relatively low-resource languages show that the model's generation quality is comparable to that of images generated from high-resource language text, and that it generates semantically consistent images for texts with the same meaning even when they are in different languages.