• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (08): 1503-1512.

• Artificial Intelligence and Data Mining •

A lightweight semantic segmentation based on attention mechanism

MA Dong-mei, WANG Peng-yu, GUO Zhi-hao

  1. (School of Physics & Electronic Engineering, Northwest Normal University, Lanzhou 730070, China)
  • Received: 2023-03-24 Revised: 2023-09-12 Accepted: 2024-08-25 Online: 2024-08-25 Published: 2024-09-02

Abstract: Semantic segmentation is a computer vision technique that extracts the key information from large numbers of images and presents it in a clearer, easier-to-understand form by means of a mask. Minimizing model size while preserving accuracy is currently a hot topic in the design of lightweight network models. Image semantic segmentation still faces many challenges, such as discontinuous segmentation, incorrect segmentation, and high model complexity. To address these problems, a lightweight semantic segmentation model based on the attention mechanism is proposed. It is trained with a freeze-thaw strategy and uses MobileNetV2 as the feature extraction network. To recover clearer target boundaries, a lightweight convolutional block attention module (CBAM) is introduced at the output of the atrous spatial pyramid pooling (ASPP) module, or channel attention (ECA-Net) is introduced in the decoding part. To address the sample imbalance problem, the focal loss function is introduced. Mixed precision is used, and the standard convolution in the output section is replaced with DO-Conv convolution. Experiments and validation are conducted on the PASCAL VOC2012 and Cityscapes datasets. The model size is 23.6 MB, with mean intersection over union (mIoU) scores of 73.91% and 74.89% and class-wise pixel accuracy scores of 82.88% and 84.87% on the two datasets, respectively. The model thus achieves a balance between segmentation accuracy and computational efficiency.
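
The abstract names focal loss as its remedy for sample imbalance but gives no formula. As a rough illustration only, the sketch below implements the standard per-pixel focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). The function names, the example probabilities, and the values gamma = 2.0 and alpha = 1.0 are assumptions made for illustration; the abstract does not state the settings used in the paper.

```python
import math


def cross_entropy(probs, target):
    """Plain cross-entropy for one pixel, given its class probabilities."""
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Standard focal loss for one pixel.

    probs  : class probabilities for the pixel (e.g. a softmax output)
    target : index of the ground-truth class
    gamma  : focusing parameter (assumed value); gamma = 0 gives cross-entropy
    alpha  : scalar weighting factor (assumed value)
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    # (1 - p_t) ** gamma shrinks the loss of well-classified pixels
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical predictions for two pixels whose true class is index 0.
easy = [0.95, 0.03, 0.02]  # confidently correct
hard = [0.20, 0.50, 0.30]  # misclassified

for name, probs in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(probs, 0)
    fl = focal_loss(probs, 0)
    print(f"{name}: cross-entropy={ce:.4f}, focal={fl:.4f}")
```

Under these assumed settings the modulating factor is 0.05^2 = 0.0025 for the easy pixel and 0.8^2 = 0.64 for the hard one, so pixels that are already classified well contribute far less to the total loss than they would under plain cross-entropy.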


Key words: semantic segmentation; DeepLabV3+; MobileNetV2; CBAM; channel attention