• Journal of the China Computer Federation
  • China Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (9): 1700-1710.

• Artificial Intelligence and Data Mining •

Optimization and reduction for deep learning test set based on MMD-GA

WANG Fengying1,2, SONG Zikai2, ZHANG Yan1, DU Liming1

  (1. School of Information Engineering, Suqian University, Suqian 223800;
   2. School of Computer Science and Engineering, Shenyang Jianzhu University, Shenyang 110168, China)
  • Received: 2023-11-21  Revised: 2024-05-09  Online: 2025-09-25  Published: 2025-09-22

Abstract: In image recognition, test sets often contain redundant test cases, and labeling them still requires manual effort. Optimizing the test set is an effective way to address high testing costs and low testing efficiency. To this end, a test case optimization and reduction method based on an evolutionary algorithm, named ERIR, is proposed. It uses a deep neural network model to extract image features, which are then fed to the HDBSCAN clustering algorithm to analyze the data distribution of the original test set. Based on the clustering results, an evolutionary algorithm is designed with the goal of minimizing the difference between the distribution of the test subset and that of the original test set. A test case selection method combining maximum mean discrepancy and a genetic algorithm, named MMD-GA, is proposed; it selects the most representative prototypes from each cluster to form the test subset. Extensive experiments were carried out with this algorithm on CNN-structure and Transformer-structure models. The results show that the selected test inputs improve time efficiency while keeping the measured accuracy close to that obtained on the original test set: the average accuracy error relative to the full test set ranges from 0.18% to 2.32%.
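The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the RBF kernel, the biased MMD estimator, the crossover and mutation operators, and the names `rbf_kernel`, `mmd2`, and `select_subset` are all assumptions made for illustration. Feature extraction and HDBSCAN clustering are omitted, so the sketch selects from a single feature matrix rather than from each cluster.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    # Pairwise squared Euclidean distances, clipped at 0 against round-off.
    d2 = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy (assumed RBF kernel)."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

def select_subset(features, k, pop_size=30, generations=50,
                  mutation_rate=0.2, gamma=1.0, seed=0):
    """Hypothetical GA: pick k row indices whose features minimize MMD^2
    against the full feature set. Requires k < len(features), pop_size >= 4."""
    rng = np.random.default_rng(seed)
    n = len(features)

    def fitness(idx):
        return mmd2(features[idx], features, gamma)

    # Each individual is an array of k distinct row indices.
    pop = [rng.choice(n, size=k, replace=False) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]  # elitist survival of the better half
        children = []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            # Crossover: draw k distinct indices from the union of two parents.
            pool = np.union1d(parents[i], parents[j])
            child = rng.choice(pool, size=k, replace=False)
            # Mutation: swap one selected index for an unselected one.
            if rng.random() < mutation_rate:
                unused = np.setdiff1d(np.arange(n), child)
                child[rng.integers(k)] = rng.choice(unused)
            children.append(child)
        pop = parents + children
    return min(pop, key=fitness)

if __name__ == "__main__":
    feats = np.random.default_rng(1).normal(size=(200, 8))  # stand-in for DNN features
    chosen = select_subset(feats, k=20)
    print(sorted(chosen.tolist()), mmd2(feats[chosen], feats))
```

Because the better half of the population always survives, the best MMD value found never gets worse from one generation to the next. In a per-cluster setting such as the one the abstract describes, a routine like this would presumably be run once per cluster with a cluster-specific `k`.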

Key words: test case set reduction, deep learning, image recognition, genetic algorithm, software testing