J4 ›› 2012, Vol. 34 ›› Issue (7): 78-83.
• Papers •
QIN Jin, GONG Chunye, HU Qingfeng, LIU Jie
Abstract:
Sparse matrix-vector multiplication (SpMV) is an important computational kernel in many scientific applications. Targeting n-diagonal sparse matrices under the CUDA programming model, this paper describes a new compressed sparse matrix format, CDIA, which is based on the DIA compressed format, and assigns each thread a fine-grained task. To satisfy CUDA's aligned memory access requirements, we transpose the compressed matrix, design a fine-grained algorithm and program, and apply several optimizations to the program. In our experiments, the best implementation achieves up to 39.6 Gflop/s in single precision and 19.6 Gflop/s in double precision, improving on the implementation of Nathan Bell and Michael Garland by about 7.6% and 17.4%, respectively.
Key words: GPU; CDIA; CUDA; sparse matrix-vector multiplication
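For readers unfamiliar with diagonal storage, the following is a minimal CPU sketch in Python of SpMV over the standard DIA format, which the abstract names as the basis of CDIA. It is not the authors' CDIA format or their CUDA kernel, whose details the abstract does not give. The function name `dia_spmv`, the flat `data` layout, and the tridiagonal test matrix are all assumptions made for illustration. The layout places the entries of each diagonal contiguously by row, so that if each iteration of the outer loop were one GPU thread, neighbouring threads would read neighbouring addresses; this is the kind of aligned access the transposition described in the abstract is meant to achieve.

```python
def dia_spmv(n, offsets, data, x):
    """Compute y = A * x for an n x n matrix A stored in DIA format.

    offsets[d] is the offset of diagonal d (0 = main, >0 above, <0 below).
    data is a flat list laid out diagonal by diagonal: the entry of
    diagonal d in row i sits at data[d * n + i]. Slots that fall outside
    the matrix are padding and are never read.
    """
    y = [0.0] * n
    for row in range(n):  # on a GPU, this would be one thread per row
        acc = 0.0
        for d, off in enumerate(offsets):
            col = row + off
            if 0 <= col < n:  # skip padding outside the matrix
                acc += data[d * n + row] * x[col]
        y[row] = acc
    return y


# Illustrative 4 x 4 tridiagonal matrix: 2 on the main diagonal,
# -1 on the sub- and super-diagonals.
n = 4
offsets = [-1, 0, 1]
data = [
    0.0, -1.0, -1.0, -1.0,   # offset -1 (row 0 is padding)
    2.0, 2.0, 2.0, 2.0,      # offset  0
    -1.0, -1.0, -1.0, 0.0,   # offset +1 (row 3 is padding)
]
x = [1.0, 2.0, 3.0, 4.0]
y = dia_spmv(n, offsets, data, x)
print(y)
```

For this input the product is `[0.0, 0.0, 0.0, 5.0]`. Because the innermost index of `data` is the row, threads handling rows `i` and `i + 1` would touch adjacent elements of each diagonal, which is what allows coalesced loads on a GPU.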
QIN Jin, GONG Chunye, HU Qingfeng, LIU Jie. Optimization of Sparse Diagonal Matrix-Vector Multiplication Based on the CUDA Program Model[J]. J4, 2012, 34(7): 78-83.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2012/V34/I7/78