• A journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (9): 1647-1657.

• Graphics and Image •

  • Funding: National Natural Science Foundation of China (62266030)

A Transformer-based pixel-by-pixel detail compensation dehazing network

WANG Yan,LIU Jingjing,HU Jinyuan,CHEN Yanyan   

  1. (School of Computer and Communication,Lanzhou University of Technology,Lanzhou 730050,China)
  • Received:2024-03-07 Revised:2024-05-14 Online:2025-09-25 Published:2025-09-22


Abstract: Current deep learning-based image dehazing algorithms struggle to extract global and local image features simultaneously, so restored images lose detail and suffer from color distortion. To address this issue, a Transformer-based pixel-wise detail compensation dehazing network is proposed, consisting mainly of a Transformer-based encoder-decoder and a CNN branch. Given a hazy input image, the encoder performs global feature extraction. The Transformer in the encoder is composed of a channel attention block (CAB), a compression attention neural block (CANB), and a dual-branch adaptive neural block (DANB). The CANB captures the global dependencies of image superpixels through feature aggregation, attention computation, and feature restoration; the DANB uses a dual-branch structure to encapsulate these superpixel-level dependencies into individual pixels, thereby obtaining global feature information. Meanwhile, spatial attention in the CNN branch enhances the network's ability to perceive different haze densities while extracting local features. Finally, in the decoder, the features extracted by the encoder and the CNN branch are fused to output a clear image. Experimental results show that the proposed network performs strongly on both the synthetic RESIDE dataset and the real-world O-HAZE and NH-HAZE datasets, effectively mitigating detail loss and color distortion.
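The CANB pipeline described above (feature aggregation → attention → feature restoration) and the CNN branch's spatial attention can be sketched in NumPy. This is a minimal illustrative sketch only: the function names, the average-pooling aggregation, the nearest-neighbor restoration, the mean/max spatial mask, and the plain-average fusion are all assumptions for exposition, not the paper's actual layers.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def superpixel_attention(feat, patch=4):
    """CANB-style step: aggregate pixels into superpixel tokens, run
    self-attention over the tokens, then restore full resolution."""
    c, h, w = feat.shape
    ph, pw = h // patch, w // patch
    # Aggregation: average-pool each patch into one token -> (ph*pw, c).
    tokens = feat.reshape(c, ph, patch, pw, patch).mean(axis=(2, 4))
    tokens = tokens.reshape(c, ph * pw).T
    # Attention: plain scaled dot-product over tokens (projections omitted).
    scores = tokens @ tokens.T / np.sqrt(c)
    out = softmax(scores, axis=-1) @ tokens          # (ph*pw, c)
    # Restoration: broadcast each token back over its patch (nearest upsample).
    out = out.T.reshape(c, ph, pw)
    out = out.repeat(patch, axis=1).repeat(patch, axis=2)
    return feat + out                                # residual restore

def spatial_attention(feat):
    """CNN-branch step: a per-pixel gate built from channel-wise mean and
    max, modulating sensitivity to different haze densities."""
    avg, mx = feat.mean(axis=0), feat.max(axis=0)
    return feat * sigmoid(0.5 * (avg + mx))[None]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))                 # toy (C, H, W) feature map
# Decoder-side fusion of the two branches (plain average here for illustration).
fused = 0.5 * (superpixel_attention(x) + spatial_attention(x))
print(fused.shape)
```

Note that the superpixel step attends over only `(H/patch) × (W/patch)` tokens instead of all `H × W` pixels, which is the efficiency motivation behind aggregating before attention and restoring afterwards.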

Key words: image dehazing, deep learning, dual-branch feature fusion, detail compensation, Transformer architecture