A load-centric core pipeline design in array many-core processors

Computer Engineering & Science

ZHANG Kun, ZHENG Fang, XIE Xiang-hui

(State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, China)

Received: 2016-09-27 Revised: 2016-12-08 Online: 2017-12-25 Published: 2017-12-25
Funding: National 863 Program of China (2015AA01A301); National Natural Science Foundation of China (91430214)

Abstract:

Traditional processor pipelines are branch-centric designs: a large share of logic resources is spent on improving branch prediction so that the issue and execution units are kept supplied with a sufficient instruction stream. We present a load-centric core pipeline design for array many-core processors. In the load-centric pipeline, load instructions are given higher priority for issue and execution, and a prediction mechanism generates a load instruction's source address in advance so that the load can be executed speculatively. Together, these reduce the stalls that memory access latency causes in an in-order pipeline, improving the pipeline's performance and energy efficiency. Experimental results show that, with a 4 KB load address prediction table, the load-centric design improves pipeline performance by 8.6% and pipeline energy efficiency by 7%.

Key words: many-core processor, core pipeline, memory access optimization, array many-core processors
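
The abstract does not say how the load address table is organized or which prediction scheme it uses. The following is a minimal sketch of one common way such a table could work, assuming a direct-mapped, PC-indexed stride predictor with 8-byte entries (so that a 4 KB table holds 512 entries). The names `LoadAddressTable`, `Entry`, `predict`, and `update` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    tag: int = -1            # PC of the load owning this entry (-1 means empty)
    last_addr: int = 0       # address used by the most recent execution
    stride: int = 0          # most recently observed address delta
    confident: bool = False  # True once the same stride has been seen twice in a row


class LoadAddressTable:
    """Hypothetical stride-based load address predictor (an assumption, not the paper's design)."""

    def __init__(self, table_bytes: int = 4096, entry_bytes: int = 8):
        # Assumed entry size: 4096 / 8 = 512 entries.
        self.num_entries = table_bytes // entry_bytes
        self.entries = [Entry() for _ in range(self.num_entries)]

    def _index(self, pc: int) -> int:
        # Assumes fixed 4-byte instructions, so the low two PC bits carry no information.
        return (pc >> 2) % self.num_entries

    def predict(self, pc: int) -> Optional[int]:
        """Return a predicted load address, or None if no confident prediction exists."""
        e = self.entries[self._index(pc)]
        if e.tag == pc and e.confident:
            return e.last_addr + e.stride
        return None

    def update(self, pc: int, actual_addr: int) -> None:
        """Train the table with the address the load actually accessed."""
        idx = self._index(pc)
        e = self.entries[idx]
        if e.tag != pc:
            # Allocate (or replace) the entry for this load.
            self.entries[idx] = Entry(tag=pc, last_addr=actual_addr)
            return
        new_stride = actual_addr - e.last_addr
        e.confident = new_stride == e.stride
        e.stride = new_stride
        e.last_addr = actual_addr
```

In this sketch, a pipeline front end would call `predict` when it fetches a load and, if an address is returned, start the memory access early. `update` would be called once the real address is known, and a mismatch between the predicted and actual address would require the speculative result to be discarded.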

ZHANG Kun, ZHENG Fang, XIE Xiang-hui. A load-centric core pipeline design in array many-core processors[J]. Computer Engineering & Science.

