• A journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (11): 14-21.


Latency-Aware Thread Scheduling Scheme for Thread-Level Speculation

LI Yanhua, ZHANG Youhui, WANG Wei, ZHENG Weimin

  1. (Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China)
  • Received: 2013-07-10  Revised: 2013-09-25  Online: 2013-11-25  Published: 2013-11-25
  • Supported by:

    National High-Tech R&D (863) Program of China (2013AA01A215); MOE-Intel Special Research Fund for Information Technology (MOE-INTEL-11-04)

Latencyaware thread scheduling
scheme for threadlevel speculation  

LI Yanhua,ZHANG Youhui,WANG Wei,ZHENG Weimin   

  1. (Department of Computer Science and Technology,Tsinghua University,Beijing 100084,China)
  • Received:2013-07-10 Revised:2013-09-25 Online:2013-11-25 Published:2013-11-25

Abstract:

With the development of large-scale chip multiprocessors (CMPs), more and more cores are integrated on a single chip. On the one hand, some cores are always idle; on the other hand, power constraints keep each on-chip core relatively simple, so single-thread performance is weak. By supporting thread-level speculation (TLS) on a CMP, the idle on-chip resources can be used to accelerate sequential program execution and improve single-thread performance. The overheads that determine TLS performance, such as increased cache miss rates, conflict detection, thread commit, and re-execution of squashed speculative threads, are highly sensitive to the CMP's memory access latency and inter-core communication latency. Conventional multithread scheduling algorithms perform poorly when applied to TLS because they ignore these TLS-specific characteristics. The proposed latency-aware speculative thread scheduling algorithm uses the memory access statistics produced in the profiling and compilation stages, together with runtime memory access records, to compute the program's center of data gravity, and gradually schedules the speculative threads onto the few adjacent cores around that center. Meanwhile, during scheduling it makes full use of the data left in the caches by committed and squashed threads, improving cache utilization. Experimental results show that, in TLS execution, the latency-aware scheduling policy achieves an average performance improvement of 16.8% over the widely used priority scheduling policy, and of 10.1% over the recently proposed clustered thread scheduling policy based on non-uniform data access optimization.

Key words: latency; chip multiprocessors; thread-level speculation; thread scheduling

Abstract:

With the advent of largescale chipmultiprocessors (CMPs), more and more cores are integrated on a single chip. On the first hand, there always will be some idle cores. And on the other hand, with the energy consumption limit, cores integrated on the chip are relatively simple. ThreadLevel Speculation (TLS) remains a promising technique for exploiting the idle hardware resources to improve the performance of a sequential program. However, the usual distributed design of largescale CMPs, like the nonuniform cache architecture (NUCA), introduces some nonuniform architectureproperties which significantly increase the overhead of TLS execution (L2 cache access overhead, task squashing overhead and reexecution overhead). Some stateoftheart multithread scheduling algorithms work poorly for TLS because of ignoring these TLSrelative characteristics. The proposed latencyaware thread scheduling algorithm for threadlevel speculation, uses the memory access statistics gained in the profiling, compiling and realtime executing stages, to calculate the CDG (Center of Data Gravity) of the program, and then schedules the speculative threads to the cores around the CDG. At the same time, the proposed thread scheduling algorithm makes good use of the data remained in the cache by the committed and squashed threads. Evaluation results show that latencyaware thread scheduling algorithm observed 16.8% performance speedup over priority scheduling, and 10.1% performance speedup over clusteredthread scheduling.

Key words: latency; chip multiprocessors; thread-level speculation; thread scheduling
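
The abstract's core idea, placing speculative threads on the cores nearest the weighted centroid of the program's memory accesses, can be illustrated with a minimal sketch. This is not the paper's implementation: the mesh layout, the function names, and the example access counts are all assumptions made for illustration, standing in for the profiling/runtime statistics the paper describes.

```python
# Hypothetical sketch of center-of-data-gravity (CDG) scheduling on a 2D mesh
# CMP. access_counts[i] approximates how often the speculative threads hit
# data held in core i's cache slice (the paper derives this from profiling,
# compilation, and runtime records; here it is just example input).
from math import dist

def center_of_data_gravity(access_counts, mesh_width):
    """Weighted centroid of per-core access counts, in mesh coordinates."""
    total = sum(access_counts)
    x = sum((i % mesh_width) * w for i, w in enumerate(access_counts)) / total
    y = sum((i // mesh_width) * w for i, w in enumerate(access_counts)) / total
    return (x, y)

def pick_cores(access_counts, mesh_width, n_threads):
    """Return the n_threads core ids closest to the CDG."""
    cdg = center_of_data_gravity(access_counts, mesh_width)
    cores = range(len(access_counts))
    return sorted(
        cores,
        key=lambda i: dist((i % mesh_width, i // mesh_width), cdg),
    )[:n_threads]

# Example: a 4x4 mesh where most accesses hit the slices of cores 5 and 6;
# the scheduler clusters four speculative threads around that hotspot.
counts = [1, 1, 1, 1,
          1, 40, 30, 1,
          1, 2, 2, 1,
          1, 1, 1, 1]
print(pick_cores(counts, 4, 4))  # -> [5, 6, 9, 10]
```

Clustering the threads this way shortens the average distance between each speculative thread and the data it touches, which is exactly the access and communication latency the abstract identifies as the dominant TLS overhead.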