• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (09): 1625-1634.

• Graphics and Images •

A military image set captioning method based on image and text relevance and context guidance

MEI Yun-hong1,2, LIU Mao-fu1,2

  (1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065;
    2. Hubei Province Key Laboratory of Intelligent Information Processing
    and Real-Time Industrial System, Wuhan 430065, China)
  • Received:2023-03-31 Revised:2023-09-14 Accepted:2024-09-25 Online:2024-09-25 Published:2024-09-23

Abstract: Traditional image captioning methods lack prior knowledge of the real world and therefore cannot generate explanatory descriptions, and the descriptions they do generate are not accurate enough in some specialized fields. To address these problems, the military news image set captioning task is proposed and a military news image set dataset is constructed. The task poses two key challenges: the descriptive information must be drawn from the whole image set and the corresponding news articles, and the semantics learned by the model are insufficient. A military news image set captioning method based on image and text relevance and context guidance (ITRCG) is therefore proposed. ITRCG realizes cross-modal information interaction, guides the model to learn more complete semantics, and uses label cleaning to assist named entity generation. Experiments on the constructed military news image set dataset show that ITRCG effectively improves the quality of the generated descriptions and achieves improvements on all evaluation metrics.

Key words: image captioning, image and text relevance attention, context guidance attention, image set, news text