Computer Engineering & Science, 2023, Vol. 45, Issue (10): 1754-1762.

• High Performance Computing •

Implementation and optimization of HYB-based SpMV on the new-generation Sunway architecture

WANG Xin (王鑫), PENG Jian (彭健)

  1. (School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China)
  • Received: 2022-10-13 Revised: 2023-01-03 Accepted: 2023-10-25 Online: 2023-10-25 Published: 2023-10-17

Abstract: Sparse matrix-vector multiplication (SpMV), the product of a sparse matrix and a dense vector, is widely used in high-performance computing. Because the non-zero elements of a sparse matrix are distributed sparsely and irregularly, SpMV is harder to parallelize than its dense counterpart, so optimizing its performance has long been a research focus in the field. Based on the HYB storage format for sparse matrices, this paper designs a parallel SpMV algorithm and a performance optimization scheme for SW26010P, the new-generation domestic Sunway heterogeneous many-core processor. To address the difficulty of choosing the HYB threshold, it proposes a multi-iteration Otsu (maximum between-class variance) method for determining that threshold. Experimental results show that, compared with the serial algorithm on the Main Processing Element (MPE) of SW26010P, the parallel SpMV algorithm achieves an average speedup of 23.36 and a maximum speedup of 34.85.

Key words: Sunway many-core architecture, sparse matrix vector multiplication (SpMV), Otsu method, parallel computing
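
To illustrate the two ideas named in the abstract, the following is a minimal serial sketch in Python, not the paper's implementation. It assumes the usual HYB layout, in which the first k non-zeros of each row go into a padded ELL part and the remainder into a COO list. It picks k with a single pass of the standard Otsu criterion over the row lengths, whereas the paper's multi-iteration variant and its SW26010P parallelization are not described here and are not reproduced. The function names `otsu_threshold`, `build_hyb`, and `spmv_hyb`, and the input layout (each row a list of `(col, value)` pairs), are hypothetical choices made for this example.

```python
from collections import Counter


def otsu_threshold(row_lengths):
    """Return the row length k that maximizes the between-class variance.

    Rows with length <= k form one class and rows with length > k the other.
    Assumes row_lengths is non-empty. This is plain single-pass Otsu, used
    here only as a stand-in for the paper's multi-iteration method.
    """
    n = len(row_lengths)
    hist = Counter(row_lengths)
    total_sum = sum(row_lengths)
    best_k, best_var = max(row_lengths), -1.0
    count_lo, sum_lo = 0, 0
    for k in sorted(hist):
        count_lo += hist[k]
        sum_lo += k * hist[k]
        count_hi = n - count_lo
        if count_hi == 0:
            break  # no rows left in the upper class
        mean_lo = sum_lo / count_lo
        mean_hi = (total_sum - sum_lo) / count_hi
        var = (count_lo / n) * (count_hi / n) * (mean_lo - mean_hi) ** 2
        if var > best_var:
            best_k, best_var = k, var
    return best_k


def build_hyb(rows, k):
    """Split rows into an ELL part of width k and a COO remainder."""
    ell_cols = [[-1] * k for _ in rows]  # -1 marks a padding slot
    ell_vals = [[0.0] * k for _ in rows]
    coo = []  # (row, col, value) triples beyond the first k entries per row
    for i, row in enumerate(rows):
        for j, (col, val) in enumerate(row):
            if j < k:
                ell_cols[i][j] = col
                ell_vals[i][j] = val
            else:
                coo.append((i, col, val))
    return ell_cols, ell_vals, coo


def spmv_hyb(ell_cols, ell_vals, coo, x):
    """Compute y = A x from the ELL part and then the COO part."""
    y = [0.0] * len(ell_cols)
    for i, (cols, vals) in enumerate(zip(ell_cols, ell_vals)):
        for col, val in zip(cols, vals):
            if col >= 0:  # skip padding
                y[i] += val * x[col]
    for i, col, val in coo:
        y[i] += val * x[col]
    return y


# Small 4x4 example with one long row that spills into the COO part.
rows = [
    [(0, 1.0), (1, 2.0)],
    [(1, 3.0)],
    [(0, 4.0), (1, 5.0), (2, 6.0), (3, 7.0)],
    [(3, 8.0)],
]
x = [1.0, 2.0, 3.0, 4.0]
k = otsu_threshold([len(r) for r in rows])
ell_cols, ell_vals, coo = build_hyb(rows, k)
y = spmv_hyb(ell_cols, ell_vals, coo, x)
```

In this example the row lengths are 2, 1, 4, and 1, so only the third row is long enough to overflow the ELL part. A larger k stores more padding in the ELL part, while a smaller k pushes more entries into the less regular COO part, which is the trade-off that makes threshold selection matter.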