• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (01): 28-35.

• Articles

Cache structure design of many-core processors for big data applications

WAN Hu1, XU Yuanchao1,2, SUN Fengyun1, YAN Junfeng1

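  (1. College of Information Engineering, Capital Normal University, Beijing 100048, China; 2. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)
  • Received: 2014-05-17 Revised: 2014-10-20 Online: 2015-01-25 Published: 2015-01-25
  • Funding:

    Beijing Natural Science Foundation (4143060); General Program of Science and Technology Development of the Beijing Municipal Education Commission (KM201210028004); Open Project of the State Key Laboratory of Computer Architecture (CARCH201203); Beijing Engineering Research Center of "High Reliable Embedded System Technology"; Overseas Visiting Scholar Program of the Talent Development Project for Beijing Municipal Universities (135300100)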

Cache structure design for big data oriented many-core processor  

WAN Hu1,XU Yuanchao1,2,SUN Fengyun1,YAN Junfeng1   

  (1. College of Information Engineering, Capital Normal University, Beijing 100048, China; 2. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China)
  • Received: 2014-05-17 Revised: 2014-10-20 Online: 2015-01-25 Published: 2015-01-25

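Abstract:

Big data applications such as large-scale data sorting, search engines, and streaming media show low resource utilization when running on latency-oriented multi-core/many-core processors: the L1 cache hit rate is high, the L2/L3 cache hit rates are low, and increasing LLC capacity does not noticeably improve IPC. To address the low utilization of cache resources, this paper analyzes the memory access behavior of big data applications and proposes two cache structure designs for many-core processors that target such applications. Both structures have only an L1 cache: the Share structure is a fully shared cache, and the Partition structure is a partially shared cache. Evaluation results show that both designs save substantial chip area with only a small increase in memory access latency. When cache capacity is low, the Partition structure outperforms the Share structure; as cache capacity grows, the Share structure gradually overtakes the Partition structure. Because the cache capacity allocated to each core in a many-core processor is limited, the Partition structure has a certain advantage.

Keywords: many-core processor, big data application, cache design, memory access behavior, data center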

Abstract:

Some big data applications, such as data sorting, search engines, and streaming media, run inefficiently on traditional latency-oriented multi-core/many-core processors. The hit rate of the L1 cache is high while that of the L2/L3 caches is relatively low, and IPC is not sensitive to LLC capacity. To address the low utilization of cache resources, we analyze the memory access patterns of big data applications and then propose two cache structures for many-core processors. Both structures have only an L1 cache: one is a fully shared cache (the share structure), and the other is a partly shared cache (the partition structure). The evaluation results show that both schemes can significantly save chip area at the cost of a slight increase in memory access latency. When cache capacity is low, the partition structure is superior to the share structure. As cache capacity increases, the share structure gradually becomes superior to the partition structure. In many-core processors, the capacity assigned to each processor core is limited, so the partition structure has certain advantages.

Key words: many-core processor; big data application; cache design; memory access behavior; data center