• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

计算机工程与科学 ›› 2024, Vol. 46 ›› Issue (08): 1349-1360.

• 高性能计算 • 上一篇    下一篇

面向离散粒子多尺度分析CPU/GPU架构的并行近邻搜索算法

代长威1,孔瑞林1,季哲1,2,3   

  1. (1.西北工业大学软件学院,陕西  西安 710129;2.西北工业大学太仓长三角研究院,江苏 苏州 215400;
    3.西北工业大学深圳研究院,广东 深圳 518063)

  • 收稿日期:2023-12-08 修回日期:2024-03-18 接受日期:2024-08-25 出版日期:2024-08-25 发布日期:2024-09-02
  • 基金资助:
    中央高校基本科研业务费专项资金(D5000210971);广东省基础与应用基础研究基金(2022A1515110314) 

A parallel fast neighbor searching algorithm for particle-based methods on CPU and GPU architectures in multi-scale simulation

DAI Chang-wei1,KONG Rui-lin1,JI Zhe1,2,3   

  1. (1.School of Software,Northwestern Polytechnical University,Xi’an 710129;
    2.Yangtze River Delta Research Institute,Northwestern Polytechnical University,Suzhou 215400;
    3.Shenzhen Research Institute,Northwestern Polytechnical University,Shenzhen 518063,China)
  • Received:2023-12-08 Revised:2024-03-18 Accepted:2024-08-25 Online:2024-08-25 Published:2024-09-02

摘要: 离散粒子法在解决前沿科学和工程领域中的复杂多尺度问题中具有广泛的应用。针对离散粒子大规模多尺度计算中相邻粒子对搜索过程计算复杂度显著增加和并发度下降的问题,提出了一种适用于众核架构(CPU/GPU)的高并发、低内存占用并行近邻搜索算法。通过提出一种基于多层嵌套网格概念的层间相互作用方法,解决了不同层级间粒子对相互搜索时的数据竞争问题;通过引入非对称映射方法,避免了粒子在多级链表上的全映射,降低了内存消耗。一系列数值实验表明,该算法可有效处理108量级粒子体积跨度变化的多尺度问题,相较于传统算法可取得2~8倍的加速效果和更低的内存消耗特性,基于GPU的算法实现可达到当前领先的计算效率。

关键词: 离散粒子法, 多尺度分析, 近邻搜索, 并行算法

Abstract: Particle-based methods are widely applied in the resolving of complex multi-scale physical phenomena in various science and engineering areas. In order to handle the challenge of increasing computational complexity and declining concurrency for the pair-wised particle searching procedure in massive multi-scale particle-based simulations, a new parallel fast neighbor searching algorithm, which features high-concurrency and low memory footprint, is developed and demonstrated on both many-core CPU and GPU architectures. An inter-level interaction strategy based on the concept of hierarchical nested data structure is proposed to resolve the issue of racing condition in cross-level particle search. An asymmetric mapping method is developed to eliminate the full mapping of particles on each level, which reduces the memory consumption. A set of numerical experiments show that, the proposed algorithm can handle multi-scale problems with particle volume ratio up to 108. Compared with traditional algorithm, the proposed algorithm can achieve 2x~8x speedups and lower memory consumption. The GPU-based implementation of the algorithm achieves state-of-the-art computational efficiency.

Key words: particle-based method, multi-scale simulation, fast neighbor searching, parallel computing