• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


Distributed SVM parameter optimization based on Hadoop

WU Yun-wei, NING Qian

  1. (College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China)
  • Received: 2016-01-30 Revised: 2016-06-07 Online: 2017-06-25 Published: 2017-06-25

Abstract:

Parameter selection directly affects an algorithm's classification and prediction accuracy. Among parameter selection methods, global grid search is simple to compute, reliable, and effective, which makes it well suited to engineering applications with high reliability requirements, such as optimizing the parameters of fault pattern recognition algorithms in system fault diagnosis. However, global grid search is time-consuming, which limits its use, especially in systems with strict real-time requirements. Taking global parameter optimization of the support vector machine (SVM) as a case study, we use the Hadoop platform to distribute the parameter search and overcome this drawback. With HDFS, the parameter sets are automatically distributed across the computing nodes. We build a distributed parameter optimization model on the MapReduce computing framework and use it to carry out model training, prediction, and parameter optimization. Experimental results show that optimization efficiency improves without reducing algorithm performance.
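
The map/reduce decomposition of a grid search described above can be illustrated with a minimal single-process sketch. It is not the authors' implementation and does not use Hadoop: the function names, the log2 grid over (C, gamma), the round-robin partitioning, and `toy_score` (a hypothetical stand-in for cross-validated SVM accuracy) are all assumptions made for illustration.

```python
import math
from functools import reduce
from itertools import product


def build_grid(c_exps, gamma_exps):
    """Enumerate (C, gamma) pairs on a log2 grid (an assumed grid layout)."""
    return [(2.0 ** c, 2.0 ** g) for c, g in product(c_exps, gamma_exps)]


def split(grid, n_parts):
    """Round-robin partition of the grid; stands in for per-node input splits."""
    return [grid[i::n_parts] for i in range(n_parts)]


def toy_score(C, gamma):
    """Hypothetical stand-in for SVM validation accuracy; peaks at C=8, gamma=0.5."""
    return 1.0 / (1.0 + (math.log2(C) - 3) ** 2 + (math.log2(gamma) + 1) ** 2)


def map_partition(partition, score_fn):
    """Map step: score every pair in one partition, emitting (score, C, gamma)."""
    return [(score_fn(C, gamma), C, gamma) for C, gamma in partition]


def reduce_best(a, b):
    """Reduce step: keep the record with the higher score."""
    return a if a[0] >= b[0] else b


def distributed_grid_search(c_exps, gamma_exps, n_parts, score_fn):
    """Run the map step over each partition, then reduce to the best record."""
    grid = build_grid(c_exps, gamma_exps)
    mapped = [rec for part in split(grid, n_parts)
              for rec in map_partition(part, score_fn)]
    return reduce(reduce_best, mapped)


if __name__ == "__main__":
    best = distributed_grid_search(range(-5, 16), range(-15, 4), 4, toy_score)
    print("best (score, C, gamma):", best)
```

Because each (C, gamma) pair is scored independently, the partitions can be evaluated in parallel, and the selected parameters do not depend on how many partitions are used. In a real deployment, `score_fn` would train and validate an SVM for each pair; the choice of validation scheme is left open here.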
 

Key words: Hadoop, MapReduce, support vector machine, grid search, parameter optimization, distributed computing