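An image semantic segmentation method based on path aggregation Atrous convolutional network

Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (04): 712-720.

LI Shu-ao, XIE Qing, MA Yan-chun, LIU Yong-jian

(School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China)

Received: 2020-01-15 Revised: 2020-06-18 Accepted: 2021-04-25 Online: 2021-04-25 Published: 2021-04-21
Funding:
National Natural Science Foundation of China (61602353)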



Abstract: Deep fully convolutional neural networks based on the encoder-decoder structure have made significant progress in image semantic segmentation. However, the path along which low-level localization information propagates to the high-level layers of a deep network is too long, which makes it difficult for the decoder to use that information to recover object boundaries. To address this problem, a path aggregation structure for the decoder of a segmentation network is proposed. The structure shortens the propagation path from low-level to high-level information and provides multi-scale contextual semantic information, so that the network produces finer boundary segmentation results. In addition, because the softmax cross-entropy loss function commonly used in semantic segmentation does not sufficiently distinguish samples with similar appearance, the softmax cross-entropy loss function is modified and a bidirectional cross-entropy loss function is proposed. Combined with the new loss function, the proposed path aggregation atrous convolutional network achieves better results on the PASCAL VOC2012Aug dataset, raising the mIoU from 78.77% to 80.44%.

Key words: image semantic segmentation; bidirectional cross-entropy; path aggregation structure; multi-scale prediction; deep learning
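
The abstract reports its results as mIoU (mean intersection-over-union). As a reading aid, the following is a minimal sketch of how this metric is commonly computed from a confusion matrix; it is not taken from the paper. The function name `mean_iou`, the flat label lists, and the use of 255 as an ignored "void" label are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union over two flat sequences of class labels.

    Assumption: pixels whose ground-truth label equals `ignore_label`
    are excluded from the evaluation.
    """
    # confusion[t][p] counts pixels with true class t and predicted class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                  # true c, predicted as another class
        fp = sum(row[c] for row in confusion) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:                                # skip classes absent from both
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(round(score, 4))
```

In the toy example the per-class IoU values are 1/2, 2/3 and 1, so the mean is 13/18. Classes that appear in neither the prediction nor the ground truth are skipped rather than counted as zero, which is one common convention among several.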

LI Shu-ao, XIE Qing, MA Yan-chun, LIU Yong-jian. An image semantic segmentation method based on path aggregation Atrous convolutional network[J]. Computer Engineering & Science, 2021, 43(04): 712-720.
