面向离散粒子多尺度分析CPU/GPU架构的并行近邻搜索算法

计算机工程与科学 ›› 2024, Vol. 46 ›› Issue (08): 1349-1360.

面向离散粒子多尺度分析CPU/GPU架构的并行近邻搜索算法

代长威1,孔瑞林1,季哲1，2，3

（1.西北工业大学软件学院，陕西西安 710129；2.西北工业大学太仓长三角研究院，江苏苏州 215400；
3.西北工业大学深圳研究院，广东深圳 518063）

收稿日期:2023-12-08 修回日期:2024-03-18 接受日期:2024-08-25 出版日期:2024-08-25 发布日期:2024-09-02
基金资助:
中央高校基本科研业务费专项资金(D5000210971);广东省基础与应用基础研究基金(2022A1515110314)

A parallel fast neighbor searching algorithm for particle-based methods on CPU and GPU architectures in multi-scale simulation

DAI Chang-wei1,KONG Rui-lin1,JI Zhe1,2,3

(1.School of Software,Northwestern Polytechnical University,Xi’an 710129;
2.Yangtze River Delta Research Institute,Northwestern Polytechnical University,Suzhou 215400;
3.Shenzhen Research Institute,Northwestern Polytechnical University,Shenzhen 518063,China)

Received:2023-12-08 Revised:2024-03-18 Accepted:2024-08-25 Online:2024-08-25 Published:2024-09-02

摘要/Abstract

摘要： 离散粒子法在解决前沿科学和工程领域中的复杂多尺度问题中具有广泛的应用。针对离散粒子大规模多尺度计算中相邻粒子对搜索过程计算复杂度显著增加和并发度下降的问题，提出了一种适用于众核架构（CPU/GPU）的高并发、低内存占用并行近邻搜索算法。通过提出一种基于多层嵌套网格概念的层间相互作用方法，解决了不同层级间粒子对相互搜索时的数据竞争问题；通过引入非对称映射方法，避免了粒子在多级链表上的全映射，降低了内存消耗。一系列数值实验表明，该算法可有效处理108量级粒子体积跨度变化的多尺度问题，相较于传统算法可取得2~8倍的加速效果和更低的内存消耗特性，基于GPU的算法实现可达到当前领先的计算效率。

关键词: 离散粒子法, 多尺度分析, 近邻搜索, 并行算法

Abstract: Particle-based methods are widely applied in the resolving of complex multi-scale physical phenomena in various science and engineering areas. In order to handle the challenge of increasing computational complexity and declining concurrency for the pair-wised particle searching procedure in massive multi-scale particle-based simulations, a new parallel fast neighbor searching algorithm, which features high-concurrency and low memory footprint, is developed and demonstrated on both many-core CPU and GPU architectures. An inter-level interaction strategy based on the concept of hierarchical nested data structure is proposed to resolve the issue of racing condition in cross-level particle search. An asymmetric mapping method is developed to eliminate the full mapping of particles on each level, which reduces the memory consumption. A set of numerical experiments show that, the proposed algorithm can handle multi-scale problems with particle volume ratio up to 108. Compared with traditional algorithm, the proposed algorithm can achieve 2x~8x speedups and lower memory consumption. The GPU-based implementation of the algorithm achieves state-of-the-art computational efficiency.

Key words: particle-based method, multi-scale simulation, fast neighbor searching, parallel computing

代长威, 孔瑞林, 季哲, . 面向离散粒子多尺度分析CPU/GPU架构的并行近邻搜索算法[J]. 计算机工程与科学, 2024, 46(08): 1349-1360.

DAI Chang-wei, KONG Rui-lin, JI Zhe, . A parallel fast neighbor searching algorithm for particle-based methods on CPU and GPU architectures in multi-scale simulation[J]. Computer Engineering & Science, 2024, 46(08): 1349-1360.

编辑推荐

Metrics

阅读次数

全文

400

HTML			PDF

最新录用	在线预览	正式出版	最新录用	在线预览	正式出版
0	0	0	0	0	400

来源	本网站	其他网站

次数	252	148
比例	63%	37%

摘要

182

最新录用	在线预览	正式出版

0	0	182

	来源	本网站

	次数	182
	比例	100%

[1]	华悦琳, 周晓磊, 范强, 王芳潇, 严浩, . 基于分区层次图的海量高维数据学习索引构建方法[J]. 计算机工程与科学, 2024, 46(07): 1193-1201.
[2]	汪静, 钱晓东. 区块链环境中基于局部敏感哈希的协同过滤推荐研究[J]. 计算机工程与科学, 2022, 44(03): 436-446.
[3]	孙浩军，张崇锐，张磊，李惊涛. 基于小波分析和Gaussian回归的急性低血压预测[J]. J4, 2016, 38(01): 156-162.
[4]	范文婷，陈秀宏. 基于子向量距离索引的特征匹配算法的改进[J]. J4, 2015, 37(04): 754-759.
[5]	李玉鑑，王影，冷强奎. 基于最近邻子空间搜索的两类文本分类方法[J]. J4, 2015, 37(01): 168-172.
[6]	陈加忠周敬利. 仅利用细节部分实现图象重构的方法研究[J]. J4, 2000, 22(1): 4-6.