• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

Efficient and transparent CPU-GPU data
communication through partial page migration

ZHANG Shiqing, YANG Yaohua, SHEN Li, WANG Zhiying

  1. (School of Computer, National University of Defense Technology, Changsha 410073, China)
  • Received: 2018-10-19 Revised: 2018-12-11 Online: 2019-07-25 Published: 2019-07-25

Abstract:

Despite growing investment in integrated GPUs and next-generation interconnects, discrete GPUs connected over PCI Express still dominate the market, and the management of data communication between CPUs and GPUs continues to evolve. Initially, programmers controlled data transfers between CPUs and GPUs explicitly. To simplify programming, GPU vendors developed a programming model that provides a single virtual address space for “CPU + GPU” heterogeneous systems. The page migration engine in this model automatically transfers pages between CPUs and GPUs on demand. To meet the needs of high-performance workloads, page sizes tend to grow. Because the interconnect has low bandwidth and high latency, migrating a larger page takes longer, which reduces the overlap between computation and transmission and can cause severe performance degradation. We propose a partial page migration mechanism that transfers only the requested part of a page, which shortens migration latency and avoids the performance degradation that whole page migration suffers as pages grow. Experiments show that the proposed partial page migration effectively hides the performance overhead of whole page migration when the page size is 2 MB and the PCI Express bandwidth is 16 GB/s. Compared with programmer-controlled data transmission, whole page migration degrades performance by 98.62 on average, while partial page migration improves performance by 1.29 on average. Additionally, we examine the impact of page size on the TLB miss rate and the impact of migration unit size on execution time, so that designers can make informed decisions.
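
As a rough illustration of why migrating only the requested part of a page shortens latency, the following sketch compares the raw link time of moving a whole 2 MB page with that of moving a single smaller unit at 16 GB/s. It is a back-of-envelope model and not the mechanism or simulator used in the paper: the 4 KB migration unit, the 1 µs fixed link latency, and the function names `transfer_time` and `migration_unit` are all assumptions made for illustration, and the model ignores fault handling and any overlap with computation.

```python
# Back-of-envelope link-time model (illustrative assumption, not the paper's simulator).
PCIE_BANDWIDTH_BYTES_PER_S = 16e9    # 16 GB/s, as stated in the abstract
PAGE_SIZE_BYTES = 2 * 1024 * 1024    # 2 MB page, as stated in the abstract
LINK_LATENCY_S = 1e-6                # hypothetical fixed per-transfer latency
UNIT_SIZE_BYTES = 4096               # hypothetical partial-migration unit


def transfer_time(num_bytes, bandwidth=PCIE_BANDWIDTH_BYTES_PER_S,
                  latency=LINK_LATENCY_S):
    """Time to move num_bytes over the link: fixed latency plus size / bandwidth."""
    return latency + num_bytes / bandwidth


def migration_unit(fault_addr, unit_bytes=UNIT_SIZE_BYTES):
    """Return the [start, end) byte range of the aligned unit containing fault_addr."""
    start = (fault_addr // unit_bytes) * unit_bytes
    return start, start + unit_bytes


whole = transfer_time(PAGE_SIZE_BYTES)     # migrate the entire page
partial = transfer_time(UNIT_SIZE_BYTES)   # migrate only the faulting unit

print(f"whole page:   {whole * 1e6:.2f} us")
print(f"partial unit: {partial * 1e6:.2f} us")
```

Under these assumed parameters the whole page occupies the link for roughly 132 µs, against about 1.3 µs for the single unit. The actual benefit depends on how much of each page a workload eventually touches, which this sketch does not model.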
 

Key words: heterogeneous “CPU + GPU” system; data communication; page migration