[1] |
Feldman M.China spills details on exascale prototypes[EB/OL].[2019-06-10].https://www.top500.org/news/china-spills-details-on-exascale-prototypes.
|
[2] |
Tang W,Wang B,Ethier S.Scientific discovery in fusion plasma turbulence simulations at extreme scale[J].Computing in Science & Engineering,2014,16(5): 44-52.
|
[3] |
Ethier S,Chang C S,Ku S H,et al.NERSC’s impact on advances of global gyrokinetic PIC codes for fusion energy research[J].Computing in Science & Engineering,2015,17(3):10-21.
|
[4] |
Wang Yi-chao,Lin Xin-hua,Cai Lin-jin,et al.Porting and optimizing GTC-P on TaihuLight supercomputer with OpenACC[J].Journal of Computer Research and Development,2018,55(4): 875-884.(in Chinese)
|
[5] |
Wei Y, Wang Y,Cai L,et al.Performance and portability studies with OpenACC accelerated version of GTC-P[C]∥Proc of International Conference on Parallel & Distributed Computing,2016:13-18.
|
[6] |
Cheng J.CUDA by example: An introduction to general-purpose GPU programming[M].New York:Addison-Wesley Professional,2010.
|
[7] |
Dongarra J J,Bunch J R,Moler C B, et al. LINPACK users’ guide[M].New York:Society for Industrial & Applied Mathematics,1979.
|
[8] |
McCalpin J D. Memory bandwidth and machine balance in current high performance computers[J].IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter,1995,2:19-25.
|
[9] |
McVoy L W, Staelin C.LMbench: Portable tools for performance analysis[C]∥Proc of USENIX Annual Technical Conference,1996: 279-294.
|
[10] |
ROCm, a new era in open GPU computing[EB/OL].[2019-06-10].https://rocm.github.io.
|
[11] |
Saini S,Hood R,Chang J,et al.Performance evaluation of an Intel Haswell- and Ivy bridge-based supercomputer using scientific and engineering applications[C]
|
|
∥Proc of 2016 IEEE 18th International Conference on High Performance Computing and Communications,IEEE 14th International Conference on Smart City,IEEE 2nd International Conference on Data Science and Systems (HPCC/SmartCity/DSS),2016: 1196-1203.
|
[12] |
McCormick P S,Braithwaite R K,Feng W.Empirical memory-access cost models in multicore NUMA architectures[C]∥Proc of International Conference on Parallel Processing (ICPP),2011:1.
|
[13] |
Wang B,Ethier S,Tang W,et al.Kinetic turbulence simulations at extreme scale on leadership-class systems[C]∥Proc of the International Conference on High Performance Computing,Networking,Storage and Analysis,2013:Article No.82.
|
|
附中文参考文献:
|
[4] |
王一超,林新华,蔡林金,等.太湖之光上利用 OpenACC 移植和优化 GTC-P[J].计算机研究与发展,2018,55(4): 875-884.
|