• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (05): 1009-1014.

• Articles •

A fuzzy K-prototypes clustering algorithm based on information gain

OUYANG Hao¹, WANG Zhiwen¹, DAI Xisheng², LIU Zhiqi¹

  1. School of Computer, Guangxi University of Science and Technology, Liuzhou 545006, China
  2. School of Electrical and Information Engineering, Guangxi University of Science and Technology, Liuzhou 545006, China
  • Received: 2014-09-10  Revised: 2014-11-04  Online: 2015-05-25  Published: 2015-05-25

Abstract:

K-prototypes clustering algorithms combine K-means and K-modes to cluster mixed data objects, that is, objects described by both numeric and categorical attributes. When computing the dissimilarity between data objects, classic K-prototypes algorithms do not account for how strongly each attribute influences the final clustering result. In real-world data, however, attributes differ in importance. In this paper, we use information gain from information theory to derive a weight for each attribute, and we apply these weights in the dissimilarity computation to obtain better clustering results. The proposed algorithm also incorporates fuzzy theory to improve its robustness to noise and its handling of uncertainty. Clustering experiments on four UCI data sets validate the effectiveness of the algorithm.

Key words: clustering; information gain; fuzzy K-prototypes; mixed data
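
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that information gain is computed per attribute against a reference labelling (for example the class column of a UCI data set), that numeric attributes have been discretized before their gain is computed, that the mixed dissimilarity uses squared differences for numeric attributes and simple 0/1 mismatch for categorical ones, and that memberships follow the usual fuzzy c-means update with fuzzifier m. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one discrete attribute with respect to the labels."""
    n = len(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the labels after splitting on the attribute value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def normalize(gains):
    """Scale non-negative gains into weights that sum to 1 (assumed scheme)."""
    total = sum(gains)
    if total == 0:
        return [1.0 / len(gains)] * len(gains)
    return [g / total for g in gains]


def weighted_dissimilarity(x, prototype, weights, numeric_idx):
    """Attribute-weighted mixed dissimilarity between an object and a prototype."""
    d = 0.0
    for j, (a, b) in enumerate(zip(x, prototype)):
        if j in numeric_idx:
            d += weights[j] * (a - b) ** 2               # numeric: squared difference
        else:
            d += weights[j] * (0.0 if a == b else 1.0)   # categorical: mismatch
    return d


def fuzzy_memberships(dists, m=2.0):
    """Fuzzy c-means style memberships of one object, given its
    dissimilarities to every prototype (requires m > 1)."""
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        # The object coincides with a prototype: split membership among those.
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(dists))]
    exponent = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** exponent for d in dists]
    total = sum(inv)
    return [v / total for v in inv]
```

In such a sketch, an attribute whose values say nothing about the reference labelling receives zero gain and therefore no influence on the dissimilarity, while the fuzzy memberships let an object that lies between two prototypes belong partly to both instead of being forced into one cluster.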