Computer Engineering & Science
FANG Juan, WEI Zelin, YU Tingwen
Abstract:
All threads within a warp execute the same instruction in lockstep on a GPU. Memory requests from some threads are served early, while requests from other threads experience long latency. A warp cannot execute its next instruction until its last request is served, which causes memory divergence. We study inter-warp heterogeneity in GPUs, and implement and optimize a cache management mechanism and a memory scheduling policy based on inter-warp heterogeneity, which reduce the negative impact of memory divergence and cache queuing latency. Warps are classified according to their L2 cache hit rate to drive the following three components: (1) a warp-type-based cache bypassing mechanism that bypasses the L2 cache for warps with low cache utilization; (2) a warp-type-based cache insertion policy that prevents data from warps with high cache utilization from being evicted prematurely; and (3) a warp-type-based memory scheduler that prioritizes requests from warps with high cache utilization, as well as requests from the same warp. Compared with the baseline GPU, the cache management mechanism and memory scheduling policy based on inter-warp heterogeneity speed up 8 different GPGPU applications by 18.0% on average.
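The three warp-type-driven components can be illustrated with a small software model. The sketch below is a simplified, hypothetical rendering of the classification and the three decisions it drives; the hit-rate thresholds, class names, and the tuple-based scheduler are illustrative assumptions, not the paper's actual hardware parameters.

```python
# Simplified model of warp-type classification driving (1) L2 bypassing,
# (2) insertion priority, and (3) memory scheduling.
# Thresholds and names are assumptions for illustration only.
from dataclasses import dataclass

HIGH_HIT = 0.7   # assumed threshold for "high cache utilization"
LOW_HIT = 0.2    # assumed threshold for "low cache utilization"

@dataclass
class Warp:
    wid: int
    hits: int = 0
    accesses: int = 0

    def hit_rate(self) -> float:
        return self.hits / self.accesses if self.accesses else 0.0

    def warp_type(self) -> str:
        r = self.hit_rate()
        if r >= HIGH_HIT:
            return "high"
        if r <= LOW_HIT:
            return "low"
        return "medium"

def should_bypass_l2(warp: Warp) -> bool:
    # (1) Bypass the L2 cache for warps with low cache utilization,
    # so their requests do not pollute the cache or queue behind it.
    return warp.warp_type() == "low"

def insertion_priority(warp: Warp) -> int:
    # (2) Give lines inserted by high-utilization warps a higher
    # priority so they are not evicted prematurely (higher = keep longer).
    return 1 if warp.warp_type() == "high" else 0

def schedule(requests):
    # (3) Serve requests from high-utilization warps first; among
    # equals, group requests from the same warp so the whole warp's
    # divergent accesses complete close together.
    return sorted(
        requests,
        key=lambda req: (0 if req[0].warp_type() == "high" else 1,
                         req[0].wid),
    )
```

For example, a warp with a 90% L2 hit rate would classify as "high" and have its requests scheduled ahead of those from a 10%-hit-rate warp, whose requests would bypass the L2 entirely.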
Key words: cache management, memory scheduling, memory divergence, warp
URL: http://joces.nudt.edu.cn/EN/Y2019/V41/I05/788