[1] Wan X H,Liu J Y,Liang W X,et al.Continual multi-view clustering[C]∥Proc of the 30th ACM International Conference on Multimedia,2022:3676-3684.
[2] Yun Z H,Cui X H,Ma K G.Online Thevenin equivalent parameter identification method of large power grids using LU factorization[J].IEEE Transactions on Power Systems,2019,34(6):4464-4475.
[3] The TOP500 list[EB/OL].[2022-12-12].https://top500.org.
[4] Anderson E,Bai Z,Bischof C,et al.LAPACK users’ guide[M].Philadelphia:Society for Industrial and Applied Mathematics,1999.
[5] Zhang P,Fang J B,Yang C Q,et al.Optimizing streaming parallelism on heterogeneous many-core architectures[J].IEEE Transactions on Parallel and Distributed Systems,2020,31(8):1878-1896.
[6] Zhang X Y,Wang Q,Zhang Y Q.Model-driven level 3 BLAS performance optimization on Loongson 3A processor[C]∥Proc of 2012 IEEE 18th International Conference on Parallel and Distributed Systems,2012:684-691.
[7] Dongarra J,Gates M,Haidar A,et al.PLASMA:Parallel linear algebra software for multicore using OpenMP[J].ACM Transactions on Mathematical Software,2019,45(2):1-35.
[8] Eigen v3[EB/OL].[2023-07-14].http://eigen.tuxfamily.org.
[9] Whaley R C.ATLAS (automatically tuned linear algebra software)[EB/OL].[2004-11-08].http://www.netlib.org/atlas/index.html.
[10] Intel MKL[EB/OL].[2022-12-12].https://software.intel.com/en-us/mkl.
[11] Elmroth E,Gustavson F,Jonsson I,et al.Recursive blocked algorithms and hybrid data structures for dense matrix library software[J].SIAM Review,2004,46(1):3-45.
[12] OpenMP standard[EB/OL].[2023-07-14].https://www.openmp.org/specifications/.
[13] Strazdins P E.A comparison of lookahead and algorithmic blocking techniques for parallel matrix factorization:TR-CS-98-07[R].Canberra:The Australian National University,1998.
[14] 2nd generation Intel Xeon scalable processors[EB/OL].[2023-07-14].https://ark.intel.com/content/www/cn/zh/ark/products/193951/intel-xeon-gold-6252n-processor-35-75m-cache-2-30-ghz.html.
[15] Kunpeng 920[EB/OL].[2023-07-14].https://www.hisilicon.com/en/products/Kunpeng.
[16] van Zee F G,Chan E,van de Geijn R A,et al.The libflame library for dense matrix computations[J].Computing in Science & Engineering,2009,11(6):56-63.
[17] Castaldo A M,Whaley R C.Scaling LAPACK panel operations using parallel cache assignment[C]∥Proc of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming,2010:223-232.
[18] Goto K,van de Geijn R A.Anatomy of high-performance matrix multiplication[J].ACM Transactions on Mathematical Software,2008,34(3):1-25.
[19] Hasan M R,Whaley R C.Effectively exploiting parallel scale for all problem sizes in LU factorization[C]∥Proc of 2014 IEEE 28th International Parallel and Distributed Processing Symposium,2014:1039-1048.