[1] Chen S M, Wang Y H, Liu S, et al. FT-Matrix: A coordination-aware architecture for signal processing[J]. IEEE Micro, 2014, 34(6): 64-73.
[2] Chen Shu-ming, Liu Sheng, Wan Jiang-hua, et al. Coordinate multi-core DSP YHFT-QMBase: Architecture and implementation[J]. SCIENTIA SINICA Informationis, 2015, 45(4): 560-573. (in Chinese)
[3] Liu Sheng, Lu Kai, Guo Yang, et al. A self-designed heterogeneous accelerator for exascale high performance computing[J]. Journal of Computer Research and Development, 2021, 58(6): 1234-1237. (in Chinese)
[4] Sanders J, Kandrot E. CUDA by example: An introduction to general-purpose GPU programming[M]. New Jersey: Addison-Wesley Professional, 2010.
[5] Chen L. Deep learning and practice with MindSpore[M]. Beijing: Tsinghua University Press, 2021.
[6] GNU binutils[DB/OL]. [2024-03-26]. https://www.gnu.org/software/binutils/.
[7] Lattner C, Adve V. LLVM: A compilation framework for lifelong program analysis & transformation[C]∥Proc of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, 2004: 75-86.
[8] Deng C, Chen Z, Shi Y, et al. Exploring ILP for VLIW architecture by quantified modeling and dynamic programming-based instruction scheduling[C]∥Proc of 2022 27th Asia and South Pacific Design Automation Conference, 2022: 256-261.
[9] Chen Zhao-yun, Wen Mei, Ma Yi-min, et al. Lightweight and efficient assembly code programming method and system for GPDSP: China, ZL202111028130.2[P]. 2021-09-02. (in Chinese)
[10] Zhao X L, Chen Z Y, Shi Y, et al. Automatic end-to-end joint optimization for kernel compilation on DSPs[C]∥Proc of 2023 60th ACM/IEEE Design Automation Conference, 2023: 1-6.
[11] Stallman R, Pesch R, Shebs S. Debugging with GDB[M]. Boston: Free Software Foundation, 1988.
[12] Bartholomew D. QEMU: A multihost, multitarget emulator[J]. Linux Journal, 2006, 2006(145): 1-3.
[13] Li Y B, Ding S, Zhang Q Y, et al. Debug information validation for optimized code[C]∥Proc of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, 2020: 1052-1065.
[14] Gong S S, Altinbüken D, Fonseca P, et al. Snowboard: Finding kernel concurrency bugs through systematic inter-thread communication analysis[C]∥Proc of the ACM SIGOPS 28th Symposium on Operating Systems Principles, 2021: 66-83.
[15] Luna G A D, Italiano D, Massarelli L, et al. Who’s debugging the debuggers? Exposing debug information bugs in optimized binaries[C]∥Proc of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021: 1034-1045.
[16] Schulte E, Forrest S, Weimer W. Automated program repair through the evolution of assembly code[C]∥Proc of the IEEE/ACM International Conference on Automated Software Engineering, 2010: 313-316.
[17] Albawi S, Mohammed T A, Al-Zawi S. Understanding of a convolutional neural network[C]∥Proc of 2017 International Conference on Engineering and Technology, 2017: 1-6.
[18] Sherstinsky A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network[J]. Physica D: Nonlinear Phenomena, 2020, 404: 132306.
[19] Paszke A, Gross S, Massa F, et al. PyTorch: An imperative style, high-performance deep learning library[C]∥Proc of the 33rd International Conference on Neural Information Processing Systems, 2019: 8026-8037.
[20] Ma Yan-jun, Yu Dian-hai, Wu Tian, et al. PaddlePaddle: An open-source deep learning platform from industrial practice[J]. Frontiers of Data & Computing, 2019, 1(1): 105-115. (in Chinese)