[1] |
Chen C,Xiang X,Liu C,et al. Xuantie-910:A commercial multi-core 12-stage pipeline out-of-order 64-bit high performance RISC-V processor with vector extension:Industrial product[C]∥Proc of 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture,2020:52-64.
|
[2] |
Leidel J D,Wang X,Chen Y. GoblinCore-64:A RISC-V based architecture for data intensive computing[C]∥Proc of 2018 IEEE High Performance Extreme Computing Conference,2018:1-8.
|
[3] |
Johns M,Kazmierski T J. A minimal RISC-V vector processor for embedded systems[C]∥Proc of 2020 Forum for Specification and Design Languages,2020:1-4.
|
[4] |
RISC-V instruction set manual volume I:Unprivileged ISA[EB/OL]. [2019-06-08]. https://riscv.org/specifications.
|
[5] |
Working draft of the proposed RISC-V V vector extension[EB/OL]. [2020-05-16]. https://github.com/riscv/riscv-v-spec.
|
[6] |
Nested-parallelism PageRank on RISC-V vector multi-processors[EB/OL]. [2019-04-19]. https://digitalassets.lib.berkeley.edu/techreports/ucb/text/EECS-2019-6.pdf.
|
[7] |
Pohl A,Greese M,Cosenza B,et al. A performance analysis of vector length agnostic code[C]∥Proc of 2019 International Conference on High Performance Computing & Simulation,2019:159-164.
|
[8] |
Liu Fang-fang,Yang Chao,Yuan Xin-hui,et al. General SpMV implementation in many-core domestic Sunway 26010 processor[J]. Journal of Software,2018,29(12):3921-3932. (in Chinese)
|
[9] |
Yang Wang-dong,Li Ken-li. Implementation and optimization of HYB based SpMV on CPU+GPU heterogeneous computing systems[J]. Computer Engineering & Science,2016,38(2):202-209. (in Chinese)
|
[10] |
Li Jia-jia,Zhang Xiu-xia,Tan Guang-ming,et al. Study of choosing the optimal storage format of sparse matrix vector multiplication[J]. Journal of Computer Research and Development,2014,51(4):882-894. (in Chinese)
|
[11] |
Li Yi-yuan,Xue Wei,Chen De-xun,et al. Performance optimization for sparse matrix-vector multiplication on Sunway architecture[J]. Chinese Journal of Computers,2020,43(6):1010-1024. (in Chinese)
|
[12] |
Gómez C,Casas M,Mantovani F,et al. Optimizing sparse matrix-vector multiplication in NEC SX-Aurora vector engine:UPC/51.310E[R]. Barcelona:Barcelona Supercomputing Center,2020.
|
[13] |
Bell N,Garland M. Implementing sparse matrix-vector multiplication on throughput-oriented processors[C]∥Proc of Conference on High Performance Computing Networking,2009:1-11.
|
[14] |
Monakov A,Lokhmotov A,Avetisyan A. Automatically tuning sparse matrix-vector multiplication for GPU architectures[C]∥Proc of International Conference on High-Performance Embedded Architectures and Compilers,2010:111-125.
|
[15] |
Kreutzer M,Hager G,Wellein G,et al. A unified sparse matrix data format for efficient general sparse matrix-vector multiply on modern processors with wide SIMD units[J]. SIAM Journal on Scientific Computing,2014,36(5):C401-C423.
|
[16] |
Liu X,Smelyanskiy M,Chow E,et al. Efficient sparse matrix-vector multiplication on x86-based many-core processors[C]∥Proc of the 27th International ACM Conference on International Conference on Supercomputing,2013:273-282.
|
[17] |
Liu W F,Vinter B. CSR5:An efficient storage format for cross-platform sparse matrix-vector multiplication[C]∥Proc of the 29th ACM International Conference on Supercomputing,2015:339-350.
|
[18] |
Spike,a RISC-V ISA simulator[EB/OL]. [2019-04-01]. https://github.com/riscv/riscv-isa-sim.
|
|
Chinese-language references (original titles of entries [8]-[11]):
|
[8] |
刘芳芳,杨超,袁欣辉,等. 面向国产申威26010众核处理器的SpMV实现与优化[J]. 软件学报,2018,29(12):3921-3932.
|
[9] |
阳王东,李肯立. 基于HYB格式稀疏矩阵与向量乘在CPU+GPU异构系统中的实现与优化[J]. 计算机工程与科学,2016,38(2):202-209.
|
[10] |
李佳佳,张秀霞,谭光明,等. 选择稀疏矩阵乘法最优存储格式的研究[J]. 计算机研究与发展,2014,51(4):882-894.
|
[11] |
李亿渊,薛巍,陈德训,等. 稀疏矩阵向量乘法在申威众核架构上的性能优化[J]. 计算机学报,2020,43(6):1010-1024.
|