• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (03): 381-394.

• High Performance Computing • Previous Articles     Next Articles

DRM: A GPU-parallel SpMV storage format based on iterative merge strategy

WANG Yu-hua1,2,HE Jun-fei1,ZHANG Yu-qi1,XU Yue-zhu1,CUI Huan-yu1   

  1. (1.College of Computer Science and Technology,Harbin Engineering University,Harbin 150001;
    2.Modeling and Emulation in E-Government National Engineering Laboratory,Harbin 150001,China)
  • Received:2023-07-14 Revised:2023-09-18 Accepted:2024-03-25 Online:2024-03-25 Published:2024-03-15

Abstract: Sparse matrix vector multiplication (SpMV) is of great significance in the solution of linear systems, and is one of the core problems in scientific computing and engineering practice. Its performance highly depends on the non-zero distribution of sparse matrices. Sparse diagonal matrices are a special type of sparse matrices, whose non-zero elements are densely arranged in the form of diagonals. For sparse diagonal matrices, scholars have proposed various storage formats on the GPU platform, which have improved SpMV performance, but still suffer from zero padding and load imbalance issues. To address these issues, a DRM (Divide-Rearrange & Merge) storage format is proposed. This format uses matrix partitioning strategies based on fixed threshold values and matrix reconstruction strategies based on iterative merging to achieve sparse zero padding and load balancing between blocks. Experimental results show that on the NVIDIA Tesla V100 platform, compared to DIA, HDC, HDIA, and DIA-Adaptive formats, the time performance is accelerated  by 20.76, 1.94, 1.13, and 2.26 times, respectively, and the floating point performance is improved by 1.54, 5.28, 1.13, and 1.94 times, respectively.

Key words: GPU, SpMV, sparse diagonal matrix, zero padding, load balancing