Computer Engineering & Science, 2024, Vol. 46, Issue 07: 1158-1166.

• High Performance Computing •

MiniBranRAP: A minimizing branch parallel algorithm of the coarse matrix computation in AMG solver

DU Hao1,2, MAO Run-zhang1,2, DENG Yun-tong1,2, HUANG Si-lu2, XU Xiao-wen2

  1. Graduate School of China Academy of Engineering Physics, Beijing 100094, China
  2. Institute of Applied Physics and Computational Mathematics, Beijing 100088, China

  • Received: 2023-11-07 Revised: 2023-12-21 Accepted: 2024-07-25 Online: 2024-07-25 Published: 2024-07-18
  • Funding:
    National Natural Science Foundation of China (62032023)

Abstract: Algebraic multigrid (AMG) is one of the most commonly used algorithms for solving large-scale sparse linear systems in scientific and engineering computing and in industrial simulation. At each grid level in the Setup phase, AMG must compute the coarse grid matrix Ac = RAP as a product of three sparse matrices: the restriction operator R, the current fine grid matrix A, and the interpolation operator P. This computation is the main bottleneck in the parallel performance of AMG. This paper first identifies that the performance bottleneck of the RAP parallel algorithms in mainstream AMG solvers is caused by the quadratic complexity of the number of branch judgments. Then, exploiting the row-major ordering of the CSR sparse matrix format, it proposes MiniBranRAP, a RAP parallel algorithm whose number of branch judgments has linear complexity. The algorithm is integrated into the JXPAMG solver, and its effectiveness is verified on cases from real applications. The numerical results show that, for 6 typical cases from real applications, the JXPAMG solver based on MiniBranRAP speeds up the Setup phase on 28 processes by 3.3 times on average and by up to 9.3 times, compared with the BoomerAMG solver in the latest version of Hypre.

Keywords: algebraic multigrid (AMG), coarse grid matrix computation, branch judgment, Hypre, JXPAMG
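
To make the triple product Ac = RAP and the notion of a branch judgment concrete, the following is a minimal sketch in pure Python. It is not the MiniBranRAP algorithm, whose details the abstract does not give, and it is not the code of Hypre or JXPAMG. It assumes a generic row-wise (Gustavson-style) sparse product on CSR triples with a marker array, and the names `csr_matmul`, `galerkin_rap`, and `csr_to_dense` are hypothetical. The counter only shows where one branch is taken per scalar multiply-add in this generic scheme.

```python
def csr_matmul(a, b, ncols_b):
    """Row-wise product C = A @ B for CSR triples (indptr, indices, data).

    Returns C in CSR form and the number of branch judgments executed.
    Illustrative sketch only; not the algorithm proposed in the paper.
    """
    a_ptr, a_idx, a_val = a
    b_ptr, b_idx, b_val = b
    marker = [-1] * ncols_b  # marker[j]: position of column j in C, if seen in this row
    c_ptr, c_idx, c_val = [0], [], []
    branches = 0
    for i in range(len(a_ptr) - 1):
        row_start = len(c_idx)  # entries at or after row_start belong to row i
        for ka in range(a_ptr[i], a_ptr[i + 1]):
            k, v = a_idx[ka], a_val[ka]
            for kb in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[kb]
                branches += 1  # one branch judgment per scalar multiply-add
                if marker[j] < row_start:
                    # Column j appears for the first time in row i: new entry.
                    marker[j] = len(c_idx)
                    c_idx.append(j)
                    c_val.append(v * b_val[kb])
                else:
                    # Column j already present in row i: accumulate.
                    c_val[marker[j]] += v * b_val[kb]
        c_ptr.append(len(c_idx))
    return (c_ptr, c_idx, c_val), branches


def galerkin_rap(r, a, p, ncoarse):
    """Compute Ac = R @ (A @ P); also return the total branch count."""
    ap, n_ap = csr_matmul(a, p, ncoarse)
    ac, n_rap = csr_matmul(r, ap, ncoarse)
    return ac, n_ap + n_rap


def csr_to_dense(m, ncols):
    """Expand a CSR triple into a list of rows (for small checks only)."""
    ptr, idx, val = m
    dense = []
    for i in range(len(ptr) - 1):
        row = [0] * ncols
        for k in range(ptr[i], ptr[i + 1]):
            row[idx[k]] += val[k]
        dense.append(row)
    return dense


if __name__ == "__main__":
    # 4x4 1D Laplacian as the fine grid matrix A (assumed test problem).
    A = ([0, 2, 5, 8, 10],
         [0, 1, 0, 1, 2, 1, 2, 3, 2, 3],
         [2, -1, -1, 2, -1, -1, 2, -1, -1, 2])
    # Piecewise-constant interpolation P (4x2) and restriction R = P^T (2x4).
    P = ([0, 1, 2, 3, 4], [0, 0, 1, 1], [1, 1, 1, 1])
    R = ([0, 2, 4], [0, 1, 2, 3], [1, 1, 1, 1])
    Ac, branches = galerkin_rap(R, A, P, 2)
    print(csr_to_dense(Ac, 2), branches)  # prints [[2, -1], [-1, 2]] 16
```

In this sketch every scalar multiply-add passes through the `if marker[j] < row_start` test, so the branch count grows with the number of multiply-adds. This is the kind of per-entry branching that the abstract identifies as the cost to be reduced. How MiniBranRAP uses the CSR row ordering to achieve that is described in the full paper, not here.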