Optimization of Sparse Diagonal Matrix-Vector Multiplication Based on the CUDA Program Model

J4 ›› 2012, Vol. 34 ›› Issue (7): 78-83.

• Articles •


QIN Jin, GONG Chunye, HU Qingfeng, LIU Jie

(School of Computer Science, National University of Defense Technology, Changsha 410073, China)

Received: 2010-05-26; Revised: 2010-08-20; Online: 2012-07-25; Published: 2012-07-25

Abstract


Sparse matrixvector multiplication is often an important computational kernel in many scientific applications. This paper faces the ndiagonal sparse matrix, uses the CUDA program model and describes a new compress format of sparse matrix based on the DIA compress format (CDIA), and gives each thread finegrained task distribution. In order to fulfill the characteristics of the align access of memory in CUDA, we transpose the compress matrix and design a finegrained algorithm and program and do some optimization to the program. In the data experiment, our best implementation achieves up to 39.6Gflop/s in singleprecision and 19.6Gflop/s in doubleprecision, and enhances the performance by about 7.6% and 17.4% that of Nathan Bell’s and Michael Garland’s respectively.

Key words: GPU; CDIA; CUDA; sparse matrix-vector multiplication

QIN Jin, GONG Chunye, HU Qingfeng, LIU Jie. Optimization of Sparse Diagonal Matrix-Vector Multiplication Based on the CUDA Program Model[J]. J4, 2012, 34(7): 78-83.
