• A journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (06): 1083-1089.

• Graphics and Images •

A text-to-image model based on a two-phase stacked generative adversarial network with spectral normalization

WANG Xia, XU Hui-ying, ZHU Xin-zhong

1. (College of Mathematics and Computer Science, Zhejiang Normal University, Jinhua 321004, China)
  • Received:2021-06-07 Revised:2021-07-13 Accepted:2022-06-25 Online:2022-06-25 Published:2022-06-17

Abstract: Generating images from text is a challenging task in the machine learning community. Although significant progress has been made, problems such as unstable network training and vanishing gradients persist. To address these shortcomings, this paper proposes a text-to-image generation method, built on the stacked generative adversarial network (StackGAN), that combines spectral normalization with a perceptual loss function. Firstly, spectral normalization is applied to the discriminator, constraining the gradient of each network layer to a fixed range; this slows the discriminator's convergence and thereby improves the stability of network training. Secondly, a perceptual loss function is added to the generator network to strengthen the consistency between the text content and the generated image. The quality of the generated images is evaluated with the Inception score. Experimental results show that, compared with the original StackGAN, the proposed model trains more stably and generates clearer images.
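Spectral normalization, as described in the abstract, divides each discriminator weight matrix by an estimate of its largest singular value (obtained by power iteration), so that every layer is approximately 1-Lipschitz. A minimal NumPy sketch of the idea; the function name, iteration count, and toy matrix are illustrative assumptions, not from the paper:

```python
import numpy as np

def spectral_normalize(W, n_iters=20):
    """Scale W by an estimate of its largest singular value (power iteration),
    so the returned matrix has spectral norm ~1 (an approximately 1-Lipschitz layer)."""
    u = np.random.default_rng(0).standard_normal(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ W @ v  # estimated largest singular value of W
    return W / sigma

# Toy weight matrix with singular values 3 and 1.
W = np.array([[3.0, 0.0],
              [0.0, 1.0]])
W_sn = spectral_normalize(W)
print(np.linalg.norm(W_sn, 2))  # → ~1.0
```

In a full GAN, this normalization is re-applied to every discriminator layer at each training step (frameworks usually provide it as a weight wrapper rather than an explicit call).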

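The perceptual loss added to the generator is, in the usual formulation, the squared distance between deep features of the generated and ground-truth images extracted by a fixed pretrained network (typically a VGG layer). To stay self-contained, the sketch below substitutes a frozen random linear-ReLU map for the pretrained extractor; `W_feat` and the image shapes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained feature extractor (e.g. one VGG layer).
# In the paper's setting this would be a fixed CNN, not a random matrix.
W_feat = rng.standard_normal((64, 3 * 8 * 8))

def features(img):
    """Toy feature map: flatten, linear projection, ReLU."""
    return np.maximum(W_feat @ img.ravel(), 0.0)

def perceptual_loss(generated, target):
    """Squared L2 distance between the two images' feature representations."""
    return float(np.sum((features(generated) - features(target)) ** 2))

real = rng.standard_normal((3, 8, 8))  # toy 3x8x8 "images"
fake = rng.standard_normal((3, 8, 8))
print(perceptual_loss(real, real))  # → 0.0 for identical images
print(perceptual_loss(fake, real) > 0)
```

Because the distance is measured in feature space rather than pixel space, the generator is pushed toward images that match the target semantically, which is what ties the generated image to the text content.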
Key words: deep learning, generative adversarial network, text-to-image generation, spectral normalization, perceptual loss function
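The Inception score used for evaluation is IS = exp(E_x[KL(p(y|x) ‖ p(y))]), computed from the class probabilities an Inception-v3 classifier assigns to generated samples. A small NumPy sketch over a hypothetical probability matrix (the classifier itself is assumed, not implemented):

```python
import numpy as np

def inception_score(p_yx, eps=1e-12):
    """p_yx: (N, C) array; row i is the classifier's distribution p(y | x_i).
    Returns exp of the mean KL divergence between p(y|x) and the marginal p(y)."""
    p_y = p_yx.mean(axis=0)  # marginal class distribution over all samples
    kl = np.sum(p_yx * (np.log(p_yx + eps) - np.log(p_y + eps)), axis=1)
    return float(np.exp(kl.mean()))

# Worst case: every sample is classified uniformly -> IS = 1.
uniform = np.full((10, 5), 0.2)
print(inception_score(uniform))  # → 1.0
# Best case: confident and diverse predictions -> IS approaches the class count.
print(inception_score(np.eye(5)))  # ≈ 5
```

Higher scores indicate that individual images are confidently classifiable (sharp) while the set as a whole covers many classes (diverse), which is why the abstract uses it as a proxy for image quality.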