• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


Impact of Linux kernel parameters on Spark workloads

WANG Li (1,2), WANG Jing (1,2), ZHANG Wei-gong (2,3), QIU Ke-ni (2,3), LU Ke-zhong (4)

  1. Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing 100048, China
  2. College of Information Engineering, Capital Normal University, Beijing 100048, China
  3. Beijing Engineering Research Center of High Reliable Embedded System, Capital Normal University, Beijing 100048, China
  4. College of Computer Science & Software Engineering, Shenzhen University, Shenzhen 518060, China
  • Received: 2017-01-03  Revised: 2017-03-02  Online: 2017-07-25  Published: 2017-07-25

Abstract:

Research on Spark performance has become a hot topic, but optimization strategies are mostly applied at the application level rather than the system level. As the first layer of software above the hardware, the operating system plays a fundamental role in how well the hardware performs. The Linux kernel exposes abundant parameters as an interface for tuning system performance. In practice, however, kernel parameters are underused: most people keep the default values rather than changing them to fit the specific environment. Our experiments show that the default values are not always the best choice, and are sometimes even the worst. We define the concept of "influence ratio" and, based on it, put forward a method for understanding the influence of parameters on Spark applications by analyzing kernel functions. Given Spark's in-memory computing model, we analyze the influence of Linux kernel parameters on several typical Spark workloads with respect to Transparent Huge Pages and NUMA, both of which are closely related to memory use, and then draw some conclusions. We hope that the analysis and conclusions can provide practical experience for tuning kernel parameters sensibly on the Spark platform.

Key words: big data, Spark, Linux, huge page, NUMA