• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (11): 14-21.

• Papers •

Latencyaware thread scheduling
scheme for threadlevel speculation  

LI Yanhua,ZHANG Youhui,WANG Wei,ZHENG Weimin   

  1. (Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
  • Received: 2013-07-10  Revised: 2013-09-25  Online: 2013-11-25  Published: 2013-11-25

Abstract:

With the advent of largescale chipmultiprocessors (CMPs), more and more cores are integrated on a single chip. On the first hand, there always will be some idle cores. And on the other hand, with the energy consumption limit, cores integrated on the chip are relatively simple. ThreadLevel Speculation (TLS) remains a promising technique for exploiting the idle hardware resources to improve the performance of a sequential program. However, the usual distributed design of largescale CMPs, like the nonuniform cache architecture (NUCA), introduces some nonuniform architectureproperties which significantly increase the overhead of TLS execution (L2 cache access overhead, task squashing overhead and reexecution overhead). Some stateoftheart multithread scheduling algorithms work poorly for TLS because of ignoring these TLSrelative characteristics. The proposed latencyaware thread scheduling algorithm for threadlevel speculation, uses the memory access statistics gained in the profiling, compiling and realtime executing stages, to calculate the CDG (Center of Data Gravity) of the program, and then schedules the speculative threads to the cores around the CDG. At the same time, the proposed thread scheduling algorithm makes good use of the data remained in the cache by the committed and squashed threads. Evaluation results show that latencyaware thread scheduling algorithm observed 16.8% performance speedup over priority scheduling, and 10.1% performance speedup over clusteredthread scheduling.

Key words: latency; chip multiprocessors; thread-level speculation; thread scheduling