• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (01): 110-117.

• Graphics and Images •

  • Funding:
    National Natural Science Foundation of China (81860318, 81560296)

A semi-supervised image annotation method based on LDA and convolutional neural network

WANG Bao-cheng1, LIU Li-jun1, HUANG Qing-song1,2

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500
  2. Yunnan Key Laboratory of Computer Technology Application, Kunming University of Science and Technology, Kunming 650500, China

  • Received: 2020-06-29 Revised: 2020-09-20 Accepted: 2022-01-25 Online: 2022-01-25 Published: 2022-01-13


Abstract: As smart devices proliferate, the number of images is growing rapidly, yet many images remain underused because they are unlabeled. To address this problem, a semi-supervised image annotation method based on LDA and a convolutional neural network is proposed. First, all text associated with the images in the training set is fed into LDA to generate annotation words for the images. Second, a convolutional neural network extracts the high-level visual features of each image; the network is optimized by adding an attention mechanism and modifying the loss function. Third, the annotation words generated by LDA are combined with the extracted high-level visual features, and the model is trained with semi-supervised learning. Finally, the correlations between annotation words are combined with the predictions of the final model to produce the final annotation of each image. Comparative experiments on the IAPR TC-12 image dataset show that the proposed method annotates images more accurately.

Keywords: LDA, convolutional neural network, attention mechanism, semi-supervised learning
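
The abstract does not say how the correlations between annotation words are combined with the model's predictions in the final step. The sketch below is only one plausible reading, not the authors' formulation. It assumes that correlation is estimated as conditional co-occurrence frequency in the training annotations and that each tag's model score is blended linearly with support from the other predicted tags. The function names `cooccurrence_correlation` and `rerank`, the weight `alpha`, and the toy data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_correlation(annotations):
    """Estimate tag-to-tag correlation from training annotations.

    corr[a][b] approximates P(b | a): the fraction of training images
    tagged with `a` that are also tagged with `b`. (Assumed measure.)
    """
    count = defaultdict(int)
    pair_count = defaultdict(lambda: defaultdict(int))
    for tags in annotations:
        unique = sorted(set(tags))
        for t in unique:
            count[t] += 1
        for a, b in combinations(unique, 2):
            pair_count[a][b] += 1
            pair_count[b][a] += 1
    return {
        a: {b: n / count[a] for b, n in row.items()}
        for a, row in pair_count.items()
    }


def rerank(scores, corr, alpha=0.3, top_k=5):
    """Blend each tag's model score with support from correlated tags.

    `alpha` (hypothetical) controls how much weight the correlation
    term receives; alpha=0 reproduces the raw model ranking.
    """
    total = sum(scores.values()) or 1.0
    blended = {}
    for tag, p in scores.items():
        # Support: how strongly the other predicted tags imply this one,
        # weighted by their scores and normalised by the total score mass.
        support = sum(
            q * corr.get(other, {}).get(tag, 0.0)
            for other, q in scores.items()
            if other != tag
        ) / total
        blended[tag] = (1 - alpha) * p + alpha * support
    return sorted(blended, key=blended.get, reverse=True)[:top_k]


# Toy training annotations and toy model scores (illustrative only).
train = [
    ["sky", "cloud", "mountain"],
    ["sky", "cloud"],
    ["sky", "sea"],
    ["street", "car"],
]
corr = cooccurrence_correlation(train)
scores = {"sky": 0.6, "car": 0.5, "cloud": 0.45}
top_tags = rerank(scores, corr, alpha=0.3, top_k=2)
```

With this toy data, "cloud" overtakes "car" in the top two because it co-occurs with "sky" in the training annotations, even though its raw score is lower. This shows why a correlation term can change the final annotation.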