• Journal of the China Computer Federation (CCF)
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (05): 855-861.

• Graphics and Images •

Discrimination-enhanced generative adversarial network in text-to-image generation

TAN Hong-chen1, HUANG Shi-hua2, XIAO He-wen3, YU Bing-bing3, LIU Xiu-ping3

  (1. School of Artificial Intelligence and Automation, Beijing University of Technology, Beijing 100124, China;
    2. Department of Computer Science, The Hong Kong Polytechnic University, Hong Kong 999077, China;
    3. School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China)
  • Received: 2021-11-11  Revised: 2022-01-07  Accepted: 2022-05-25  Online: 2022-05-25  Published: 2022-05-24

Abstract: Most current text-to-image generation algorithms based on Generative Adversarial Networks (GANs) focus on designing attention-based generation models to improve the characterization and expression of image details. However, they ignore the discriminator's perception of key local semantics, so the generation model can easily produce poor image details that "fool" the discriminator. This paper designs a vocabulary-image discriminative attention module inside the discriminator to enhance its ability to perceive and capture key semantics, and thereby to drive the generation model to produce high-quality image details. On this basis, a discrimination-enhanced generative adversarial network (DE-GAN) is proposed. Experimental results show that, on the CUB-Bird dataset, DE-GAN achieves an Inception Score (IS) of 4.70, 4.2% higher than the baseline model, demonstrating its high performance.
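The abstract does not give the module's exact formulation, but the general idea of word-image attention in a discriminator — letting each image region attend over the word embeddings of the caption and scoring how well the region matches its attended text context — can be sketched as follows. This is a minimal NumPy illustration under stated assumptions; the function names, feature dimensions, and the cosine-similarity match score are hypothetical, not DE-GAN's actual design:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def word_image_attention(word_feats, region_feats):
    """Hypothetical word-image discriminative attention sketch.

    word_feats:   (T, D) word embeddings of the caption.
    region_feats: (N, D) local image-region features from the discriminator.
    Returns a per-region text-image consistency score in [-1, 1].
    """
    # Each region attends over all words (dot-product attention).
    scores = region_feats @ word_feats.T          # (N, T)
    attn = softmax(scores, axis=1)                # rows sum to 1
    context = attn @ word_feats                   # (N, D) text context per region
    # Match score: cosine similarity between a region and its text context.
    num = (region_feats * context).sum(axis=1)
    den = (np.linalg.norm(region_feats, axis=1)
           * np.linalg.norm(context, axis=1) + 1e-8)
    return num / den

# Toy example: 5 caption words, 9 image regions (e.g. a 3x3 grid), D = 16.
rng = np.random.default_rng(0)
words = rng.standard_normal((5, 16))
regions = rng.standard_normal((9, 16))
match = word_image_attention(words, regions)      # shape (9,)
```

In a real discriminator these per-region scores would feed into the adversarial loss, penalizing image regions that are inconsistent with the key words of the caption — which is the "enhanced perception of key local semantics" the abstract describes.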

Key words: text-to-image generation, generative adversarial network, attention mechanism, discrimination model