Computer Engineering & Science
WEN Jing,CAO Yan,ZHANG Lin,MU Xiang-wei
Abstract:
Two major factors affect the quality of k-means clustering: the number of clusters and the initial choice of centroids. We propose an improved k-means algorithm based on a double genetic algorithm. The outer sub-genetic algorithm controls the number of clusters, the inner sub-genetic algorithm controls the initial choice of cluster centroids, and the clustering results are evaluated using the intra-class distance, the inter-class distance, and the ratio between them. The improved k-means method can therefore obtain both the optimal number of clusters and the corresponding optimal initial cluster centroids. In addition, because the inner and outer sub-genetic algorithms have different characteristics, the improved algorithm uses two different encoding strategies, and it applies an elitism preservation strategy to retain excellent individuals. Experiments on the UCI data set verify the effectiveness of the improved k-means algorithm, which can serve as a reference for data mining.
Key words: double genetic, cluster analysis, k-means algorithm, layered coding, elitism preservation
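The abstract names the intra-class distance, the inter-class distance, and their ratio as the evaluation criteria but does not give the formulas. The following is a minimal sketch of one plausible reading, not the authors' definition: it assumes the intra-class distance is the mean distance from each point to its own cluster centroid, the inter-class distance is the mean pairwise distance between centroids, and the fitness is inter divided by intra. The function names are hypothetical.

```python
import math
from itertools import combinations


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def intra_class_distance(clusters):
    """Assumed definition: mean distance from each point to its own centroid."""
    total, count = 0.0, 0
    for pts in clusters:
        c = centroid(pts)
        total += sum(math.dist(p, c) for p in pts)
        count += len(pts)
    return total / count


def inter_class_distance(clusters):
    """Assumed definition: mean distance over all pairs of cluster centroids.

    Requires at least two clusters.
    """
    cents = [centroid(pts) for pts in clusters]
    pairs = list(combinations(cents, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def fitness(clusters):
    """Hypothetical fitness: inter / intra, so larger means better separation.

    Undefined when every point coincides with its centroid (intra == 0).
    """
    return inter_class_distance(clusters) / intra_class_distance(clusters)


# Two ways of splitting the same four points into two clusters.
compact = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
mixed = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]
print(fitness(compact), fitness(mixed))
```

Under these assumptions the compact partition scores higher than the mixed one, which is the behavior a genetic search over cluster counts and initial centroids would need from its fitness function.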
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2017/V39/I12/2320