J4 ›› 2013, Vol. 35 ›› Issue (11): 14-21.
• 论文 • Previous Articles Next Articles
LI Yanhua,ZHANG Youhui,WANG Wei,ZHENG Weimin
Received:
Revised:
Online:
Published:
Abstract:
With the advent of largescale chipmultiprocessors (CMPs), more and more cores are integrated on a single chip. On the first hand, there always will be some idle cores. And on the other hand, with the energy consumption limit, cores integrated on the chip are relatively simple. ThreadLevel Speculation (TLS) remains a promising technique for exploiting the idle hardware resources to improve the performance of a sequential program. However, the usual distributed design of largescale CMPs, like the nonuniform cache architecture (NUCA), introduces some nonuniform architectureproperties which significantly increase the overhead of TLS execution (L2 cache access overhead, task squashing overhead and reexecution overhead). Some stateoftheart multithread scheduling algorithms work poorly for TLS because of ignoring these TLSrelative characteristics. The proposed latencyaware thread scheduling algorithm for threadlevel speculation, uses the memory access statistics gained in the profiling, compiling and realtime executing stages, to calculate the CDG (Center of Data Gravity) of the program, and then schedules the speculative threads to the cores around the CDG. At the same time, the proposed thread scheduling algorithm makes good use of the data remained in the cache by the committed and squashed threads. Evaluation results show that latencyaware thread scheduling algorithm observed 16.8% performance speedup over priority scheduling, and 10.1% performance speedup over clusteredthread scheduling.
Key words: Latency;chip multiprocessors;threadlevel speculation;thread scheduling
LI Yanhua,ZHANG Youhui,WANG Wei,ZHENG Weimin. Latencyaware thread scheduling scheme for threadlevel speculation [J]. J4, 2013, 35(11): 14-21.
0 / / Recommend
Add to citation manager EndNote|Ris|BibTeX
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2013/V35/I11/14