• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (4): 695-705.

• Graphics and Images •

An improved marine animal object detection algorithm based on YOLOv8n: DPSC-YOLO

LIANG Jiajie,XU Huiying,ZHU Xinzhong,WANG Shumeng,LIU Ziyang,LI Chen   

  1. (School of Computer Science and Technology (School of Artificial Intelligence), Zhejiang Normal University, Jinhua 321004, China)
  • Received:2023-08-05 Revised:2024-05-09 Online:2025-04-25 Published:2025-04-17

Abstract: In complex marine environments, deep learning-based object detection algorithms face challenges such as difficult feature extraction and missed detections caused by blurred image capture and complex backgrounds. Marine object detection algorithms therefore need to be more efficient and deliver better performance. To address this, an improved marine animal detection algorithm based on YOLOv8n, named DPSC-YOLO, is proposed. The DCNv2 module is introduced into the backbone network to adapt to geometric variations of objects by enhancing spatial modeling capabilities. Spatial pyramid pooling faster cross stage partial channel (SPPFCSPC) is added at the end of the backbone network to reduce computational complexity while maintaining the model's receptive field. An F2 small object detection head is added to the neck network and combined with the other three scales, so that four detection layers with different receptive fields improve the accuracy of extremely small object detection. The CoTAttention mechanism is integrated into the C2f module of the neck network to better utilize contextual information between adjacent keys and to dynamically adjust attention allocation based on data characteristics. Experimental results show that DPSC-YOLO improves mAP@0.5 by 1.1% and mAP@0.5:0.95 by 4.6% compared to YOLOv8n, with only a slight increase in parameters and computational complexity. This shows that DPSC-YOLO is more suitable for object detection tasks in complex marine environments.
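To illustrate why an additional high-resolution head can help with extremely small objects, the following minimal sketch compares the grid resolution of four detection scales. It is not the authors' implementation: the stride values (4, 8, 16, 32), the 640-pixel input size, the 8-pixel object size, and the names `ASSUMED_STRIDES`, `grid_side`, and `summarize` are all assumptions chosen for illustration, following the stride convention commonly used in YOLO-family detectors.

```python
# Illustrative sketch only. The strides, the 640-pixel input, and the 8-pixel
# object are assumptions, not values taken from the paper.
ASSUMED_STRIDES = {
    "F2 (added small-object head)": 4,
    "scale 2": 8,
    "scale 3": 16,
    "scale 4": 32,
}


def grid_side(input_size: int, stride: int) -> int:
    """Number of grid cells along one side of the feature map."""
    return input_size // stride


def summarize(input_size: int = 640, object_px: float = 8.0):
    """Return (name, stride, grid side, total cells, object extent in cells)."""
    rows = []
    for name, stride in ASSUMED_STRIDES.items():
        side = grid_side(input_size, stride)
        # How many cells the object spans along one axis at this scale.
        extent = object_px / stride
        rows.append((name, stride, side, side * side, extent))
    return rows


if __name__ == "__main__":
    for name, stride, side, cells, extent in summarize():
        print(f"{name:30s} stride={stride:2d} grid={side}x{side} "
              f"cells={cells:6d} object spans {extent:.2f} cells")
```

Under these assumptions, an 8-pixel object spans two cells on the stride-4 grid but only a quarter of a cell on the stride-32 grid, which is one intuition for why a finer detection layer can reduce missed detections of very small targets.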

Key words: you only look once version 8 (YOLOv8), deformable ConvNets v2 (DCNv2), spatial pyramid pooling faster cross stage partial channel (SPPFCSPC), contextual Transformer attention (CoTAttention), small object detection head