• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (06): 1030-1039.

• Computer Network and Information Security •

An image tampering detection model based on improved Faster R-CNN

TIAN Xiu-xia (田秀霞), LIU Zheng (刘正), LIU Qiu-xu (刘秋旭), LI Hao-ran (李浩然)

  1. (College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200090, China)
  • Received: 2021-07-06  Revised: 2022-04-19  Accepted: 2023-06-25  Online: 2023-06-25  Published: 2023-06-16
  • Supported by:
    National Natural Science Foundation of China (61772327); horizontal project of the Electric Power Research Institute, State Grid Gansu Electric Power Company (H2019-275); Open Project of the Shanghai Engineering Research Center for Big Data Management Systems (H2020-216)

Abstract: With the development of artificial intelligence, digital images are widely used in many fields. However, image editing software has made it easy to tamper with images maliciously, which seriously undermines the authenticity of image content. Unlike general object detection, image tampering detection must attend to the tampering traces in the image itself, which are often faint, so the detection model needs to learn richer tampering features. This paper proposes a dual-stream Faster R-CNN model that combines gradient edge information with an attention mechanism and can detect and localize regions affected by different types of tampering. The first stream is the color stream, which uses the attention mechanism to extract surface features of the image, such as brightness contrast and visual differences at tampering boundaries. The second stream is the gradient stream, which uses a gradient high-pass filter to enhance the anomalous edge features between authentic and tampered regions, making it easier for the model to find faint tampering traces. The features of the color stream and the gradient stream are then fused by compact bilinear pooling. Because publicly available image tampering datasets are relatively small, an image tampering detection dataset of about 10 000 images is created from PASCAL VOC 2012 for model pre-training. Experimental results on the COVER, Columbia, and CASIA datasets show that the proposed model improves detection accuracy by 7.1% to 9.6% over the current best-performing models and is more robust to JPEG compression and image blurring attacks.

Key words: image tampering detection, deep learning, attention mechanism, compact bilinear pooling
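
The abstract names compact bilinear pooling as the step that fuses the color stream and the gradient stream, but it gives no implementation details. The following is a minimal sketch, assuming the standard Tensor Sketch formulation of compact bilinear pooling: each feature vector is projected with a count sketch, and the two sketches are combined by circular convolution, which equals a count sketch of their outer product. The function names, the output dimension, and the seeds are illustrative assumptions and are not taken from the paper. The convolution is written as a direct double loop for clarity; practical implementations typically compute it with an FFT and apply it at each spatial location before sum pooling.

```python
import random


def make_hash(n_in, d_out, seed):
    """Draw a random count-sketch hash (hypothetical helper).

    Returns a bucket index h[i] in [0, d_out) and a sign s[i] in {-1, +1}
    for every input dimension i. The hash is fixed once and then reused.
    """
    rng = random.Random(seed)
    h = [rng.randrange(d_out) for _ in range(n_in)]
    s = [rng.choice((-1.0, 1.0)) for _ in range(n_in)]
    return h, s


def count_sketch(x, h, s, d_out):
    """Project x to d_out dimensions: x[i] is added to bucket h[i] with sign s[i]."""
    out = [0.0] * d_out
    for xi, hi, si in zip(x, h, s):
        out[hi] += si * xi
    return out


def circular_conv(a, b):
    """Circular convolution of two equal-length vectors (direct O(d^2) form)."""
    d = len(a)
    out = [0.0] * d
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue  # empty buckets contribute nothing
        for j, bj in enumerate(b):
            out[(i + j) % d] += ai * bj
    return out


def compact_bilinear(x, y, hx, sx, hy, sy, d_out):
    """Fuse two feature vectors into one d_out-dimensional vector.

    The result equals the count sketch of the outer product of x and y under
    the combined hash ((hx[i] + hy[j]) mod d_out, sx[i] * sy[j]), so the full
    len(x) * len(y) bilinear feature is never materialized.
    """
    return circular_conv(count_sketch(x, hx, sx, d_out),
                         count_sketch(y, hy, sy, d_out))


def pool_locations(color_feats, grad_feats, hx, sx, hy, sy, d_out):
    """Sum the fused vectors over all spatial locations of a region."""
    pooled = [0.0] * d_out
    for x, y in zip(color_feats, grad_feats):
        fused = compact_bilinear(x, y, hx, sx, hy, sy, d_out)
        pooled = [p + f for p, f in zip(pooled, fused)]
    return pooled
```

In this sketch, `color_feats` and `grad_feats` would be lists of per-location feature vectors from the two streams for one region proposal, and `pool_locations` returns a single fused descriptor of length `d_out`. How the paper normalizes or consumes that descriptor is not stated in the abstract.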