A method for constructing performance analysis model of high performance application based on random forest classifier

Computer Engineering & Science, 2024, Vol. 46, Issue (07): 1218-1228.

• High Performance Computing •

CHAI Xu-qing¹,²,³, QIAO Yi-hang¹,²,³, FAN Li-lin¹,²,³

(1. College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007;
2. High Performance Computing Center, Henan Normal University, Xinxiang 453007;
3. Henan Engineering Laboratory of Intelligent Commerce and Internet of Things Technology, Xinxiang 453007, China)

Received: 2023-11-03; Revised: 2023-12-22; Accepted: 2024-07-25; Online: 2024-07-25; Published: 2024-07-19

Abstract

Traditional performance analysis methods for high performance applications add overhead during analysis and can produce inaccurate results, which costs users extra time and demands domain knowledge. To address these issues, this paper recasts program performance analysis as a multi-class classification problem on a small, imbalanced dataset with high-dimensional features. A total of 500 performance data samples are collected, covering seven types of runtime metrics, including the number of process switches, memory utilization, and disk I/O load. After preprocessing such as PCA dimensionality reduction, a random forest classifier is trained as the program performance problem analysis model. Experimental validation shows that the model can identify five types of performance issues, including excessive memory utilization and heavy disk I/O load. To evaluate how effectively the model's output guides optimization, this paper collects runtime performance data from the HotSpot3D program and the LU-Decomposition program. Guided by the model's output, the two validation programs are optimized at the runtime level and the compilation level. Experimental results indicate that the proposed method can effectively guide program performance optimization, with speedup ratios of 1.056 and 5.657 for the two programs, respectively.

Key words: Nmon, performance analysis, variational autoencoder, cluster, random forest
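
The classification step described in the abstract can be illustrated with a minimal, standard-library-only sketch of a random forest (bootstrap sampling, random feature subsets at each node, majority voting). This is not the authors' implementation: the data are synthetic, the three features and three class labels are illustrative stand-ins for the paper's seven metric types and five issue types, and the PCA preprocessing and Nmon data collection are omitted. All function names and parameters are assumptions made for the example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Return (impurity, feature, threshold) for the best split, or None."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive observed values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(X, y, rng, depth=0, max_depth=6, n_sub=2):
    """Grow one tree; each node considers a random subset of n_sub features."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority  # leaf: a class label
    split = best_split(X, y, rng.sample(range(len(X[0])), n_sub))
    if split is None:
        return majority
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       rng, depth + 1, max_depth, n_sub),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       rng, depth + 1, max_depth, n_sub))

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=15, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

def make_samples(rng, counts):
    """Synthetic rows: [process switches, memory utilization, disk I/O load], scaled to 0-1."""
    hot = {"normal": None, "high_memory": 1, "heavy_disk_io": 2}  # hypothetical labels
    X, y = [], []
    for label, n in counts.items():
        for _ in range(n):
            row = [rng.uniform(0.0, 0.5) for _ in range(3)]
            if hot[label] is not None:
                row[hot[label]] = rng.uniform(0.8, 1.0)
            X.append(row)
            y.append(label)
    return X, y

rng = random.Random(42)
# Deliberately imbalanced class sizes, mirroring the small unbalanced setting.
X_train, y_train = make_samples(rng, {"normal": 60, "high_memory": 20, "heavy_disk_io": 10})
X_test, y_test = make_samples(rng, {"normal": 20, "high_memory": 10, "heavy_disk_io": 10})
forest = fit_forest(X_train, y_train)
accuracy = sum(predict(forest, r) == lab for r, lab in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice a library implementation with class weighting and a dimensionality reduction stage would likely replace this hand-rolled version; the sketch only makes the voting and feature-subsampling mechanics visible.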

CHAI Xu-qing, QIAO Yi-hang, FAN Li-lin. A method for constructing performance analysis model of high performance application based on random forest classifier [J]. Computer Engineering & Science, 2024, 46(07): 1218-1228.
