• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (09): 1616-1524.

• Graphics and Images •

Mobile monocular depth estimation based on multi-scale feature fusion

CHEN Lei1, LIANG Zheng-you1,2, SUN Yu1, CAI Jun-min1

  (1. School of Computer and Electronics Information, Guangxi University, Nanning 530004, China;
   2. Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning 530004, China)
  • Received:2023-05-10 Revised:2023-11-21 Accepted:2024-09-25 Online:2024-09-25 Published:2024-09-23

Abstract: Current deep-learning-based depth estimation models contain large numbers of parameters, which makes them difficult to adapt to mobile devices. To address this issue, a lightweight depth estimation method with multi-scale feature fusion that can be deployed on mobile devices is proposed. First, MobileNetV2 is used as the backbone to extract features at four scales. Then, skip connection paths from the encoder to the decoder are constructed to fuse the four-scale features, fully exploiting the positional information of the lower layers together with the semantic information of the higher layers. Finally, the fused features are passed through convolutional layers to produce high-precision depth images. After training and testing on the NYU Depth Dataset V2, the experimental results show that the proposed model achieves advanced performance, reaching a δ1 accuracy of 0.812 with only 1.6×10⁶ parameters. In addition, it takes only 0.094 seconds to infer a single image on the Kirin 980 CPU of a mobile device, demonstrating its practical application value.
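The decoder's multi-scale fusion described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the channel counts correspond to MobileNetV2's stride-4/8/16/32 stages, but the interpolation method and fusion layers here are assumptions for demonstration.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def fuse_scales(features):
    """Fuse a fine-to-coarse list of (C_i, H_i, W_i) feature maps.

    Each coarser map is upsampled until it matches the finest
    resolution, then all maps are concatenated along the channel
    axis, combining lower-layer positional information with
    higher-layer semantic information as the abstract describes.
    """
    target_h = features[0].shape[1]
    fused = []
    for f in features:
        while f.shape[1] < target_h:
            f = upsample2x(f)
        fused.append(f)
    return np.concatenate(fused, axis=0)

# Four scales from a 224x224 input at strides 4/8/16/32
# (channel counts follow MobileNetV2's intermediate stages)
feats = [np.random.rand(c, 224 // s, 224 // s)
         for c, s in [(24, 4), (32, 8), (96, 16), (320, 32)]]
fused = fuse_scales(feats)
print(fused.shape)  # (472, 56, 56)
```

In the full model, the fused tensor would then pass through convolutional layers to regress the final depth map; here only the fusion step is shown.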

Key words: deep learning, depth estimation, multi-scale feature, lightweight network, mobile terminal model