• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


A K-modes clustering algorithm based on Bayes distance measure
 

ZHAO Liang1,LIU Jianhui2,ZHANG Zhaozhao2   

  (1. Institute of Graduate, Liaoning Technical University, Fuxin 123000;
   2. School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125000, China)
  • Received:2015-06-05 Revised:2015-11-27 Online:2017-01-25 Published:2017-01-25

Abstract:

The original distance measure of Kmodes clustering algorithm cannot reflect the difference between categorical variables. To overcome this drawback, we propose a new distance measure algorithm based on the intermediate result of Nave Bayes classifier. This algorithm constructs feature vectors to present categorical variables and uses the Euclidean distance of the feature vectors as distance between variables. We implement the Kmodes algorithm with the new derived measure and the experiments on extensive UCI data sets show that the proposal is more effective in comparison with other measure algorithms.

Key words: K-modes clustering algorithm, categorical variables, Naive Bayes classifier, distance measure