• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (09): 1539-1546.

• High Performance Computing •

An urgent memory reuse algorithm for computational graphs

CAO Bo-jun, QIAN Ru-yi, XU Yuan-chao

  1. (College of Information Engineering, Capital Normal University, Beijing 100048, China)
  • Received: 2024-02-05 Revised: 2024-04-05 Accepted: 2024-09-25 Online: 2024-09-25 Published: 2024-09-19

Abstract: Limited device memory capacity restricts the further scaling of deep neural network models, and memory reuse is one of the few techniques that reduce memory usage without introducing additional overhead. Intermediate tensors in the computational graph occupy most of the memory and are therefore the primary optimization target of memory reuse algorithms. Typical existing algorithms, including the large tensor priority algorithm and the short lifetime priority algorithm, each consider only a single tensor characteristic and check only whether tensor lifetimes overlap, ignoring the relative positions of the lifetimes of adjacent tensors. As a result, the more complex the computational graph becomes, the less fully these algorithms exploit the available reuse opportunities. To address this issue, a new memory reuse algorithm, UMR, is proposed. By analyzing the relative positions of the lifetimes of adjacent tensors in the graph and reusing their memory promptly, UMR finds more opportunities for memory reuse. Evaluated on real inference models from MLPerf, UMR achieves a memory reuse rate no lower than that of existing mainstream algorithms and reaches the theoretical optimum for memory reuse on these models. On relatively complex computational graphs, UMR reduces memory usage by up to 21.6% and 18.7% compared with the large tensor priority and short lifetime priority algorithms, with average savings of 6.5% and 13.2%, respectively.

Key words: computational graph, memory optimization, memory reuse, memory usage rate
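
To make the two baseline strategies named in the abstract concrete, the following is a minimal Python sketch of lifetime-based memory reuse with shared buffers. It is not the UMR algorithm, whose details the abstract does not give. The names `Tensor`, `Buffer`, `plan`, `large_tensor_first`, `short_lifetime_first`, and `peak_live_bytes`, the first-fit buffer choice, and the toy tensor sizes are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tensor:
    name: str
    size: int   # bytes
    first: int  # index of the operator that produces the tensor
    last: int   # index of the last operator that consumes it


@dataclass
class Buffer:
    size: int = 0
    tenants: list = field(default_factory=list)


def overlaps(a, b):
    """True if the two inclusive lifetimes share at least one step."""
    return a.first <= b.last and b.first <= a.last


def plan(tensors, priority):
    """Greedy first-fit assignment of tensors to shared buffers.

    Tensors are visited in `priority` order. A tensor may join a buffer
    only if its lifetime overlaps none of the tensors already placed there.
    """
    buffers = []
    for t in sorted(tensors, key=priority):
        for buf in buffers:
            if not any(overlaps(t, other) for other in buf.tenants):
                buf.tenants.append(t)
                buf.size = max(buf.size, t.size)
                break
        else:
            buffers.append(Buffer(size=t.size, tenants=[t]))
    return buffers


def total_bytes(buffers):
    return sum(b.size for b in buffers)


def large_tensor_first(t):
    # Assumed ordering for "large tensor priority": biggest tensors first.
    return -t.size


def short_lifetime_first(t):
    # Assumed ordering for "short lifetime priority": shortest lifetime first.
    return t.last - t.first


def peak_live_bytes(tensors):
    """Largest sum of simultaneously live tensor sizes (a simple lower bound)."""
    steps = range(min(t.first for t in tensors), max(t.last for t in tensors) + 1)
    return max(sum(t.size for t in tensors if t.first <= s <= t.last) for s in steps)


# Hypothetical toy input: four intermediate tensors over four operators.
example = [
    Tensor("a", 64, 0, 1),
    Tensor("b", 32, 1, 2),
    Tensor("c", 64, 2, 3),
    Tensor("d", 16, 0, 3),
]

if __name__ == "__main__":
    for name, key in [("large first", large_tensor_first),
                      ("short lifetime first", short_lifetime_first)]:
        print(name, total_bytes(plan(example, key)), "bytes")
    print("no reuse", sum(t.size for t in example), "bytes")
    print("peak live", peak_live_bytes(example), "bytes")
```

On this toy input both orderings need 112 bytes, against 176 bytes without reuse, and 112 bytes is also the peak of simultaneously live tensors. Both orderings only test whether lifetimes overlap, which is the limitation the abstract attributes to the existing algorithms.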