• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (04): 737-745.

• Artificial Intelligence and Data Mining •

Efficient semantic segmentation based on improved DeepLabV3+

MA Dong-mei, LI Peng-hui, HUANG Xin-yue, ZHANG Qian, YANG Xin

  1. (School of Physics & Electronic Engineering, Northwest Normal University, Lanzhou 730070, China)
  • Received: 2020-09-25 Revised: 2020-11-02 Accepted: 2022-04-25 Online: 2022-04-25 Published: 2022-04-20
  • Supported by:
    National Natural Science Foundation of China (61961037)

Abstract: Current high-precision semantic segmentation models generally suffer from high computational complexity and large memory usage, which makes them difficult to deploy on embedded platforms with limited storage and computing power. To address this problem, an efficient semantic segmentation model based on an improved DeepLabV3+ is proposed, jointly considering the number of network parameters, the amount of computation, and performance. The model uses MobileNetV2 as the backbone network and adds mixed strip pooling (MSP) in parallel within the atrous spatial pyramid pooling (ASPP) module to obtain dense context information. An efficient channel attention (ECA) module is introduced in the decoder to recover clearer object boundaries. Depthwise separable convolution is applied to the ASPP module and the decoder to compress the model. In experiments on the PASCAL VOC 2012 dataset, the model has 4.5×10⁶ network parameters, a floating-point computation cost of 11.13 GFLOPs, and a mean intersection over union (mIoU) of 72.07%, achieving a good balance between computational efficiency and segmentation accuracy.
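To illustrate why depthwise separable convolution compresses a model, the following minimal sketch compares weight counts for a single layer. The function names and the layer sizes (256 input channels, 256 output channels, 3×3 kernel) are hypothetical and are not taken from the paper; bias terms and normalization layers are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer sizes only (assumption, not from the paper).
    c_in, c_out, k = 256, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

For these assumed sizes the separable layer needs 67840 weights against 589824 for the standard layer, a ratio of about 0.115, which equals 1/c_out + 1/k². The savings in the actual model depend on its real layer configuration, which the abstract does not list.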


Key words: semantic segmentation, DeepLabV3+, strip pooling, efficient channel attention, depthwise separable convolution
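The mean intersection over union quoted in the abstract is a standard segmentation metric. As a rough sketch of how such a score is typically computed from a confusion matrix, consider the code below. The helper names and the toy labels are hypothetical, and two simplifications are assumed: there is no handling of an "ignore" label, and classes that appear in neither the ground truth nor the prediction are skipped when averaging. The paper's exact evaluation protocol may differ.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy flattened label maps with three classes (illustrative only).
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    print(mean_iou(confusion_matrix(y_true, y_pred, 3)))
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.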