Youyuan Lin

2026

Generative vision-language models (VLMs) can edit and synthesize images, yet their ability to adapt visual assets across markets remains under-evaluated. We study cross-market image transcreation via movie posters, where localization must preserve a movie’s identity while matching market-specific design preferences and multilingual typography. We introduce the Movie Poster Transcreation Benchmark (MPTc-Bench), a cross-market benchmark of 582 aligned poster examples spanning 34 target markets, and define two task variants: Surface (text-centric localization) and Deep (preference-level style adaptation). We propose a two-stage planner-editor pipeline in which a VLM planner specifies executable edits and an image editor renders them. We evaluate in a triplet setup (source, human target-market poster, model output) using information-preservation checks, LLM-as-a-judge ratings for aesthetics and target-market fit, and objective similarity signals. Across multiple planners and editors, experiments reveal substantial gaps between model outputs and human target-market posters, highlighting open challenges for market-aware generation. MPTc-Bench enables controlled, quantitative progress on cross-market image editing beyond understanding-centric benchmarks.

Venues

Findings
