• Official journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (11): 2027-2034.

• Graphics and Images •

Semantic segmentation of foggy driving scenes based on learnable image filter

XU Xin, LI Ruo-shi, YUAN Ye, LIU Na

  1. (Institute of Machine Intelligence, University of Shanghai for Science and Technology, Shanghai 200093, China)

  • Received: 2023-08-12 Revised: 2023-12-19 Accepted: 2024-11-25 Online: 2024-11-25 Published: 2024-11-27
  • Funding:
    National Natural Science Foundation of China (92048205)

Abstract: Although deep learning-based semantic segmentation methods perform well on conventional driving datasets, segmenting the low-quality images captured in fog remains challenging. To address this, a learnable image filter (LIF) module is proposed that exploits the intrinsic characteristics of driving-scene images under different fog densities to improve semantic segmentation in foggy driving conditions. The LIF module consists of a hyperparameter prediction module (HPM) and an image filtering module (IFM); the hyperparameters of the filters in the IFM are predicted by the HPM. The HPM and the semantic segmentation network are trained jointly in an end-to-end manner, so the HPM learns suitable IFM parameters and enhances images for segmentation in a weakly supervised way. With DeepLabV3+, PSPNet, and RefineNet as baseline models, experiments were conducted on a mixed dataset of Cityscapes and Foggy Cityscapes. With the LIF module added, the baselines reach mean intersection over union (MIoU) scores of 63.14%, 60.45%, and 61.41%, improvements of 3.03%, 1.52%, and 1.69% over the respective baselines. These results demonstrate the effectiveness and generality of the proposed module.
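
The data flow described in the abstract (the HPM predicts filter hyperparameters, the IFM applies them, and the result is passed to the segmentation network) might be sketched roughly as follows. This is only a forward-pass illustration under stated assumptions: the abstract does not say which filters the IFM contains or how the HPM is built, so the gamma and contrast filters, the linear predictor on colour statistics, and the names `hpm`, `ifm`, `lif`, `weights`, and `bias` are all hypothetical. In practice both modules would presumably be written in an autodiff framework so that the segmentation loss can update the HPM.

```python
import numpy as np


def hpm(image, weights, bias):
    """Hypothetical HPM: map global colour statistics to two filter hyperparameters."""
    # Per-channel mean and standard deviation give a 6-dimensional feature vector.
    feats = np.concatenate([image.mean(axis=(0, 1)), image.std(axis=(0, 1))])
    raw = np.tanh(weights @ feats + bias)   # two values squashed into (-1, 1)
    gamma = np.exp(raw[0])                  # gamma exponent in (1/e, e)
    contrast = 1.0 + 0.5 * raw[1]           # contrast gain in (0.5, 1.5)
    return gamma, contrast


def ifm(image, gamma, contrast):
    """Hypothetical IFM: gamma correction followed by a contrast gain around mid-grey."""
    out = np.clip(image, 1e-6, 1.0) ** gamma
    out = 0.5 + contrast * (out - 0.5)
    return np.clip(out, 0.0, 1.0)


def lif(image, weights, bias):
    """LIF sketch: the HPM predicts the hyperparameters that the IFM then applies."""
    gamma, contrast = hpm(image, weights, bias)
    return ifm(image, gamma, contrast)


# Stand-in for a washed-out foggy image with values in [0, 1], shape (H, W, 3).
rng = np.random.default_rng(0)
foggy = rng.uniform(0.5, 0.9, size=(8, 16, 3))

# Untrained (all-zero) parameters give gamma = 1 and contrast = 1, i.e. an identity filter.
weights = np.zeros((2, 6))
identity_out = lif(foggy, weights, np.zeros(2))

# A positive bias on the gamma output yields gamma > 1, which darkens the bright haze.
darker_out = lif(foggy, weights, np.array([2.0, 0.0]))
```

Because every operation in `ifm` is differentiable in `gamma` and `contrast`, a segmentation loss computed on the filtered image could be backpropagated into the predictor, which is one plausible reading of the "weakly supervised" enhancement mentioned above: no clean reference image is needed.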

Key words: foggy image, image semantic segmentation, image filter, convolutional neural network, image processing
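
For readers unfamiliar with the MIoU metric reported in the abstract, the following is a minimal sketch of one common way to compute it from a confusion matrix. The function name `mean_iou`, the use of 255 as an ignore label (the usual Cityscapes training-label convention), and the choice to average only over classes that appear in the prediction or the ground truth are assumptions for illustration; the paper's exact evaluation protocol is not given here.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection over union across classes, skipping ignored pixels."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_index  # drop pixels labelled as "ignore"

    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    idx = target[keep] * num_classes + pred[keep]
    conf = np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

    inter = np.diag(conf)                                 # true positives per class
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter   # predicted + actual - overlap
    present = union > 0                                   # skip classes that never occur
    return float((inter[present] / union[present]).mean())


# Toy example with three classes; the last pixel is ignored.
target = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
# Per-class IoU is 1/2, 2/3 and 1, so the mean is 13/18.
```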