• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2013, Vol. 35 ›› Issue (2): 127-132.

• Articles •

Dual clustering method of mixed data set

Chen Xinquan     

  1. (School of Computer Science and Engineering, Chongqing Three Gorges University, Chongqing 404000, China)
  • Received:2012-05-19 Revised:2012-08-23 Online:2013-02-25 Published:2013-02-25

Abstract:

To effectively preprocess mixed data sets from complex information environments, this paper proposes a dual clustering method. The method is implemented by an algorithm (or its improved version) that constructs a dual near-neighbor undirected graph, followed by one of three clustering algorithms: one based on merging disjoint sets, one based on breadth-first search, or one based on depth-first search. Simulation experiments on several artificial data sets and UCI standard data sets verify that the three clustering algorithms produce the same final results even though they use different search strategies. The experimental results also show that the dual clustering method often obtains better clustering quality than the k-means algorithm and the AP algorithm on data sets with apparent clusters and without near-neighbor noise. This demonstrates that the dual clustering method is comparatively effective and practical. Finally, some directions for future research are given for further exploring and popularizing the method.

Key words: mixed data set; disjoint set; breadth-first search; depth-first search; dual clustering
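
The abstract's claim that the three clustering algorithms reach the same result can be illustrated with a minimal sketch. It assumes that each algorithm amounts to labelling the connected components of an undirected graph that has already been built. The abstract does not describe how the dual near-neighbor graph is constructed, so the edge list below is a hypothetical stand-in, and the function names are illustrative rather than the paper's own.

```python
from collections import deque


def components_union_find(n, edges):
    """Label nodes by merging disjoint sets over the edge list."""
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    return [find(i) for i in range(n)]


def components_search(n, edges, depth_first):
    """Label nodes by graph traversal: stack order if depth_first, else queue order."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    label = [-1] * n
    for start in range(n):
        if label[start] != -1:
            continue  # already assigned to an earlier cluster
        label[start] = start
        frontier = deque([start])
        while frontier:
            node = frontier.pop() if depth_first else frontier.popleft()
            for nb in adj[node]:
                if label[nb] == -1:
                    label[nb] = start
                    frontier.append(nb)
    return label


def as_partition(labels):
    """Convert per-node labels into a sorted list of clusters for comparison."""
    groups = {}
    for node, lab in enumerate(labels):
        groups.setdefault(lab, []).append(node)
    return sorted(groups.values())


# Hypothetical graph: 6 points, with edges standing in for near-neighbor links.
edges = [(0, 1), (1, 2), (3, 4)]
by_merge = as_partition(components_union_find(6, edges))
by_bfs = as_partition(components_search(6, edges, depth_first=False))
by_dfs = as_partition(components_search(6, edges, depth_first=True))
print(by_merge == by_bfs == by_dfs)  # True
```

The raw labels differ between the strategies (the disjoint-set version uses root indices, the traversals use the first node visited), so the comparison is made on the partitions rather than on the label values.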