• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2012, Vol. 34 ›› Issue (7): 54-59.

• Paper

Runtime Partitioning Technique for the Hybrid Distributed Shared Memory Space in Multicore Processors

CHEN Xiaowen1, CHEN Shuming1, LU Zhonghai2, Axel Jantsch2

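  1. School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, China
  2. Department of Electronic Systems, KTH Royal Institute of Technology, Stockholm 16440, Sweden
  • Received: 2010-07-05 Revised: 2010-10-24 Publication date: 2012-07-25 Release date: 2012-07-25
  • Funding:

    National 863 Program of China (2009AA011704); Ministry of Education Innovative Research Team Program on "High-Performance Microprocessor Technology" (IRT0614)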

Runtime Partitioning Technique of Hybrid Distributed Shared Memory Space in Multicore Processors

CHEN Xiaowen1,CHEN Shuming1,LU Zhonghai2,Axel Jantsch2   

  1. School of Computer Science, National University of Defense Technology, Changsha 410073, China
  2. Department of Electronic Systems, KTH Royal Institute of Technology, Stockholm 16440, Sweden
  • Received: 2010-07-05 Revised: 2010-10-24 Online: 2012-07-25 Published: 2012-07-25

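Abstract:

In multicore processor chips, distributed shared memory (DSM) provides a unified, globally addressed memory space, but it introduces the overhead of translating virtual addresses into physical addresses, which hurts performance. We observe that, during the execution of a parallel program, the property of the data being processed (private or shared) is not fixed. Different data in a parallel program have different properties, and even the same datum may have different properties in different execution phases. This paper first describes in detail a hybrid DSM space that supports globally addressed virtual-address access for shared data and fast physical-address access for private data. It then proposes a runtime partitioning technique for this hybrid DSM space. Based on the properties of the data in the parallel program, the technique adjusts and partitions the DSM space while the program runs. When data are private, physical-address access speeds up data access; when data are shared, virtual-address access is retained. This reduces the address translation overhead over the whole run of the parallel program and improves system performance. Experimental results on real applications show that, compared with a conventional DSM space, the runtime-partitioned hybrid DSM space has a performance advantage, and that the size of the improvement depends on the network size, the computation size, the way the parallel program is mapped, and other factors. In our experiments, the performance improvement is at most 13.14% and at least 6.98%.

Keywords: address translation, data property, runtime partitioning, distributed shared memory, multicore processor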

Abstract:

In multicore processors, Distributed Shared Memory (DSM) offers ease of programming by maintaining a global virtual memory space as well as imports the inherent overhead of translating virtual memory addresses into physical memory addresses, resulting in negative performance. We observe that, in parallel applications, different data have different properties (private or shared). Even for the same datum, its property may be changeable in different phases of the program execution. This paper firstly introduces a hybrid DSM, aiming at supporting fast and physical memory accesses for private data and maintaining a global and single virtual memory space for shared data. A runtime partitioning technique is proposed to change the hybrid DSM organization during the program execution. It ensures fast physical memory addressing on private data and conventional virtual memory addressing on shared data, improving the performance of the entire system by reducing virtualtophysical address translation overhead as much as possible. The experimental results show that the hybrid DSM with runtime partitioning demonstrates performance advantage over the conventional DSM counterpart. The percentage of performance improvement depends on network size, problem size, way of data partitioning, etc. In our experiments, the maximal improvement is 13.14%, and the minimal improvement 6.98%.

Key words: address translation; data property; runtime partitioning; distributed shared memory; multicore processor
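
The idea summarized in the abstract can be illustrated with a toy cost model. The sketch below is not the authors' implementation: the class `HybridMemory`, the single movable boundary, the two-phase workload, and the cycle costs are all hypothetical and chosen only to show why switching private data to physical addressing at run time reduces translation overhead.

```python
from dataclasses import dataclass

# Hypothetical cycle costs, chosen only for illustration (not from the paper).
ACCESS_COST = 1        # cost of touching memory once the physical address is known
TRANSLATION_COST = 3   # extra cost of one virtual-to-physical translation


@dataclass
class HybridMemory:
    """Toy model of one node's local memory with a movable private/shared boundary."""
    size: int           # number of words in the local memory
    boundary: int = 0   # words [0, boundary) are private; [boundary, size) are shared
    cycles: int = 0     # accumulated access cost

    def set_boundary(self, boundary: int) -> None:
        """Runtime partitioning step: move the private/shared boundary."""
        if not 0 <= boundary <= self.size:
            raise ValueError("boundary outside local memory")
        self.boundary = boundary

    def access(self, addr: int) -> None:
        """Charge the cost of one access to local word `addr`."""
        if not 0 <= addr < self.size:
            raise IndexError("address outside local memory")
        if addr < self.boundary:
            self.cycles += ACCESS_COST                     # private: physical addressing
        else:
            self.cycles += TRANSLATION_COST + ACCESS_COST  # shared: translate first


def run(repartition: bool) -> int:
    """Two-phase workload: words 0..63 are private in phase 1, shared in phase 2."""
    mem = HybridMemory(size=128)
    if repartition:
        mem.set_boundary(64)   # phase 1: mark words 0..63 as private
    for addr in range(64):
        mem.access(addr)
    mem.set_boundary(0)        # phase 2: the same words become shared again
    for addr in range(64):
        mem.access(addr)
    return mem.cycles


conventional = run(repartition=False)  # every access pays for translation
hybrid = run(repartition=True)         # phase 1 skips translation
print(conventional, hybrid)
```

In this model the conventional run pays the translation cost on every access, while the repartitioned run pays it only in the phase where the data are shared. The gap between the two totals is a property of the assumed costs and workload, and says nothing about the percentages reported in the experiments.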