• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (08): 1454-1460.

• Graphics and Images •

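A lightweight semantic segmentation algorithm based on ENet

XU Shi-jie, DU Yu, LU Xin, WU Si-fan

  1. (Smart City College, Beijing Union University, Beijing 100101, China)
  • Received: 2020-05-15 Revised: 2020-08-24 Accepted: 2021-08-25 Online: 2021-08-25 Published: 2021-08-24
  • Funding:
    National Natural Science Foundation of China (91420202)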

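Abstract: Semantic segmentation algorithms classify images at the pixel level and are widely used in autonomous driving, medical image processing, and industrial automation, which gives them considerable research value. Research on semantic segmentation concentrates on three goals: improving segmentation accuracy, reducing the number of parameters, and increasing inference speed. The classic lightweight semantic segmentation algorithm ENet uses a multi-layer convolutional encoder-decoder and a large number of dilated convolutions to avoid excessive downsampling and to exploit spatial information. Although this preserves a degree of spatial information integrity and a large receptive field, ENet suffers from a bloated encoder-decoder, poor propagation of spatial information, and a receptive field that overflows and causes the gridding effect. To address these problems, this paper prunes the ENet structure and uses an attention mechanism together with pyramid-structured dilated convolutions to design a spatial information transmission module, which streamlines the network, improves its receptive field, and passes spatial information through intact. The resulting improved ENet algorithm is named C-ENet+AM+RAM. Experimental results on the public datasets Cityscapes and BDD100K show that the new module improves the performance of the original model with fewer parameters and less computation, which demonstrates both the redundancy of the parts removed from the original algorithm and the effectiveness of the designed module.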

Key words: semantic segmentation, lightweight, real-time, attention mechanism, receptive field, dilated convolution
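
The receptive-field and gridding-effect argument in the abstract can be illustrated with a small one-dimensional calculation. The sketch below is not the paper's module: the function names and the dilation rates (2, 2, 2) and (1, 2, 5) are assumptions chosen purely for illustration. It counts which input positions can influence a single output of a stack of stride-1, 3-tap dilated convolutions.

```python
def receptive_field(dilations, kernel_size=3):
    """Width (1-D) of the receptive field of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


def touched_offsets(dilations, kernel_size=3):
    """Input offsets (relative to the output position) that are actually sampled."""
    half = kernel_size // 2
    taps = range(-half, half + 1)
    offsets = {0}
    for d in dilations:
        # Each layer shifts every reachable offset by one of its dilated taps.
        offsets = {o + t * d for o in offsets for t in taps}
    return offsets


def coverage(dilations, kernel_size=3):
    """Fraction of the receptive field that is sampled; below 1.0 means holes (gridding)."""
    return len(touched_offsets(dilations, kernel_size)) / receptive_field(dilations, kernel_size)


# Illustrative rates only (assumed, not taken from the paper).
for rates in [(2, 2, 2), (1, 2, 5)]:
    print(rates, receptive_field(rates), len(touched_offsets(rates)), round(coverage(rates), 3))
# (2, 2, 2) -> receptive field 13, 7 positions sampled, coverage 0.538
# (1, 2, 5) -> receptive field 17, 17 positions sampled, coverage 1.0
```

With a constant rate of 2, only every second input position contributes, which is the gridding effect; mixing rates in a pyramid-like schedule yields a larger receptive field with no holes in this toy setting.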