Minha Jhang
2026
A Universal Avoidance Method for Diverse Multi-branch Generation
Kyeongman Park | Minha Jhang | Kyomin Jung
Findings of the Association for Computational Linguistics: ACL 2026
Modern generative models still lack human-level creativity, particularly in multi-branch diversity. Prior approaches to this problem often incur heavy computation or depend strongly on model architecture. We therefore introduce **UAG** (**U**niversal **A**voidance **G**eneration), a model-agnostic and computationally efficient generation strategy that penalizes similarity among previously generated outputs. As a result, UAG can enhance multi-branch diversity across both diffusion and transformer models with minimal additional computation. In experiments, our method achieves up to 1.9 times higher diversity, runs 4.4 times faster, and requires only 1/64 of the FLOPs compared to state-of-the-art methods.
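The general idea of penalizing similarity to earlier outputs can be illustrated with a toy example. The sketch below is not the UAG algorithm from the paper. It assumes a hypothetical setup in which each candidate output has a quality score and an embedding vector, and it subtracts a cosine-similarity penalty before choosing each branch. The names `pick_with_avoidance`, `generate_branches`, and `penalty` are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pick_with_avoidance(candidates, previous, penalty=1.0):
    """Return the index of the candidate with the best penalized score.

    candidates: list of (quality, embedding) pairs (hypothetical inputs).
    previous:   embeddings of outputs already chosen for earlier branches.
    The score is quality minus penalty times the highest similarity
    to any previous output.
    """
    best_idx, best_score = None, -math.inf
    for idx, (quality, emb) in enumerate(candidates):
        max_sim = max((cosine(emb, p) for p in previous), default=0.0)
        score = quality - penalty * max_sim
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx


def generate_branches(candidates, n_branches, penalty=1.0):
    """Choose n_branches candidates one at a time, avoiding earlier picks."""
    previous, chosen = [], []
    for _ in range(n_branches):
        idx = pick_with_avoidance(candidates, previous, penalty)
        chosen.append(idx)
        previous.append(candidates[idx][1])
    return chosen


# Two near-duplicate candidates and one distinct candidate.
toy = [(1.0, [1.0, 0.0]), (0.9, [1.0, 0.01]), (0.8, [0.0, 1.0])]
print(generate_branches(toy, 2, penalty=1.0))  # [0, 2]
print(generate_branches(toy, 2, penalty=0.0))  # [0, 0]
```

With the penalty switched off, both branches collapse onto the same highest-quality candidate. With it on, the second branch moves to the dissimilar candidate despite its lower quality score.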
Evaluating Visual Narrative Coherence in Story Visualization via Diversified Storylines
Minha Jhang | Kyeongman Park | Hyukhun Koh | Kyomin Jung
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Story visualization requires generating a coherent sequence of images that collectively form a narrative, yet existing evaluation metrics and datasets often overlook visual continuity and narrative diversity. In this paper, we introduce the Visual Context-Aware Metric for Story Visualization (VCMS), which uses large vision-language models to jointly assess caption fidelity and inter-image consistency, achieving Spearman’s correlation comparable to human agreement on two benchmarks. To address the shortcomings of narrowly defined, low-diversity datasets, we also propose a diffusion-augmented evaluation pipeline that blends diverse and controlled narrative elements at adjustable ratios, producing challenging evaluation sets. By combining VCMS with this pipeline, we provide a scalable, human-aligned framework for evaluating story visualization models.
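Spearman's correlation, the agreement measure named in the abstract above, is the Pearson correlation computed on ranks. The following minimal sketch shows how it could be computed between a metric's scores and human ratings. The function names and the example numbers are made up for illustration and are not data from the paper.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical metric scores and human ratings for four generated stories.
metric_scores = [0.2, 0.5, 0.7, 0.9]
human_ratings = [1, 2, 3, 4]
print(spearman(metric_scores, human_ratings))  # 1.0
```

Because only the ordering matters, a metric that ranks stories exactly as humans do scores 1.0 even when its raw values sit on a different scale.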
2025
Conditional [MASK] Discrete Diffusion Language Model
Hyukhun Koh | Minha Jhang | Dohyung Kim | Sangmook Lee | Kyomin Jung
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Although autoregressive models excel in natural language processing, they often struggle to generate diverse text and provide limited controllability. Non-autoregressive methods are a possible alternative but often produce degenerate outputs and exhibit shortcomings in conditional generation. To address these challenges, we propose Diffusion-EAGS, a novel framework that integrates conditional masked language models into diffusion language models through the theoretical lens of a conditional Markov Random Field. In doing so, we propose entropy-adaptive Gibbs sampling and entropy-based noise scheduling to counterbalance each model’s shortcomings. Experimental results show that Diffusion-EAGS outperforms baselines and achieves the best quality-diversity tradeoff, demonstrating its effectiveness in non-autoregressive text generation.