J4 ›› 2006, Vol. 28 ›› Issue (12): 74-76.
• Papers •
Abstract:
A problem with clustering analysis algorithms is that their results are not statistically tested. This paper develops a clustering analysis algorithm with randomized statistical testing. It consists of three parts: calculation of distance measures, randomized testing, and hierarchical clustering. In this algorithm the between-sample distance is defined as the value 1 - p_test, where p_test is calculated from the randomization procedure for the two samples. If the between-class distance meets the p_test criterion, it is statistically reasonable to combine the two classes into one. Fourteen distance measures and three methods of hierarchical clustering are given. The algorithm is implemented in Java as a network program comprising six Java classes and an HTML file, and it can run in Java-enabled Web browsers. The algorithm is tested on an investigation of rice invertebrate diversity. The criteria for choosing distance measures and the prospects for improving the algorithm are discussed.
Key words: cluster analysis, randomized statistical testing, distance measure, algorithm, network implementation
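The distance definition in the abstract (between-sample distance = 1 - p_test) can be illustrated with a minimal sketch. This is not the authors' Java implementation, and the abstract does not specify the test statistic or the merge threshold. The sketch assumes that each sample is a list of numbers, that the statistic is the absolute difference of means, and that two classes may be merged when the distance is below 1 - alpha. The names `randomization_distance`, `can_merge`, `n_perm`, and `alpha` are hypothetical.

```python
import random


def randomization_distance(a, b, n_perm=999, seed=0):
    """Return 1 - p_test for two samples, with p_test from a permutation test.

    Assumption: the statistic is the absolute difference of sample means;
    the paper offers fourteen distance measures that are not listed here.
    """
    rng = random.Random(seed)

    def stat(x, y):
        return abs(sum(x) / len(x) - sum(y) / len(y))

    observed = stat(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled observations to the two samples.
        rng.shuffle(pooled)
        if stat(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Count the observed arrangement itself so p_test is never zero.
    p_test = (hits + 1) / (n_perm + 1)
    return 1.0 - p_test


def can_merge(distance, alpha=0.05):
    """Assumed merge rule: merge when p_test > alpha, i.e. distance < 1 - alpha."""
    return distance < 1.0 - alpha
```

Under these assumptions, two identical samples get a distance of 0 and may be merged, while two clearly separated samples get a distance close to 1 and stay apart. A hierarchical clustering routine would apply such a rule each time it considers combining two classes.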
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2006/V28/I12/74