• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

计算机工程与科学

• 高性能计算 • 上一篇    下一篇

基于无锁原子操作的多线程并行Delaunay三角化算法

王俊吉1,2,朱朝艳1,2,4,陈建军1,2,郑澎3,5,徐权3   

  1. (1.浙江大学工程与科学计算研究中心,浙江 杭州 310027;2.浙江大学航空航天学院,浙江 杭州 310027;
    3.中国工程物理研究院高性能数值模拟软件中心,北京 100088;4.浙江大学宁波理工学院,浙江 宁波 315100;
    5.中国工程物理研究院计算机应用研究所,四川 绵阳 621900)
  • 收稿日期:2017-11-02 修回日期:2018-02-11 出版日期:2018-05-25 发布日期:2018-05-25
  • 基金资助:

    科学挑战专题(TZ2016002)

A multithreaded parallel Delaunay triangulation
algorithm based on lock-free atomic operations

WANG Jun-ji1,2,ZHU Chao-yan1,2,4,CHEN Jian-jun1,2,ZHENG Peng3,5,XU Quan3   

  1. (1.Center for Engineering and Scientific Computation,Zhejiang University,Hangzhou 310027;
    2.School of Aeronautics and Astronautics,Zhejiang University,Hangzhou 310027;
    3.Software Center for High Performance Numerical Simulation,China Academy of Engineering Physics,Beijing 100088;
    4.Ningbo Institute of Technology,Zhejiang University,Ningbo 315100;
    5.Institute of Computer Application,China Academy of Engineering Physics,Mianyang 621900,China)
  • Received:2017-11-02 Revised:2018-02-11 Online:2018-05-25 Published:2018-05-25

摘要:

基于OpenMP实现了一种基于空腔交叠互斥准则与无锁原子操作的Delaunay三角化增量插点细粒度并行算法。在串行算法的基础上,对点集引入Hilbert排序,使相邻点在几何上亦相邻。引入互斥机制——仅当各空腔无公共单元及公共相邻边时,才可同时插入,根据Delaunay局部性准则可保证整个网格都具备Delaunay属性。每个单元用一个原子变量标记该单元是否已被占有,在计算Delaunay空腔时,各线程将试图写入该原子变量,但本竞争机制保证有且仅有一个线程能成功获得该单元的所有权,以保证算法的互斥性。经数值实验表明,对于107的点集,该算法在16核下加速比可达7.06倍。

关键词: Delaunay三角化, 网格生成, 多线程并行算法, 并行计算, OpenMP, 原子操作

Abstract:

 

 
This paper uses OpenMP to implement a fine-grain parallel incremental insertion algorithm for Delaunay triangulation, which adopts an exclusive criterion of cavity overlapping and lock-free atomic operations. Based on the serial algorithm, Hilbert sorting is introduced into the point set so that adjacent points are also geometrically adjacent. An exclusive criterion that the points can be inserted simultaneously only if their cavities share neither common elements nor common boundaries is introduced to guarantee that the whole mesh has the Delaunay property according to the Delaunay lemma. Each element uses an atomic variable to mark whether the element is occupied by any of the threads. While calculating the Delaunay cavities, threads try to occupy those elements, but only one of them can succeed to write the atomic variable, so the exclusivity of the algorithm is guaranteed. Numerical experiments on the platform with a 16-core Intel Xeon CPU E5-2640 v3 @ 2.60GHz and a 64 GiB memory show that,for a 107-point set, the algorithm can reach the speedup of 7.06 with 16 cores.
 

Key words: Delaunay triangulation, mesh generation, multithreaded parallel algorithm, parallel computing, OpenMP, atomic operation