Abstract

A problem with cluster analysis algorithms is that their results are not statistically tested. This paper develops a cluster analysis algorithm with randomized statistical testing. It consists of three parts: calculation of distance measures, randomized testing, and hierarchical clustering. In this algorithm the between-sample distance is defined as 1-p_test, where p_test is calculated from the randomization procedure for the two samples. If the between-class distance meets the p_test criterion, it is statistically reasonable to combine the two classes into one. Fourteen distance measures and three methods of hierarchical clustering are provided. The algorithm is implemented as a network program in Java, comprising six Java classes and an HTML file, and runs in Java-enabled Web browsers. The algorithm is tested using data from an investigation of rice invertebrate diversity. The criteria for choosing distance measures and the prospects for improving the algorithm are discussed.

Key words: cluster analysis; randomized statistical testing; distance measure; algorithm; network implementation
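
The idea of using 1-p_test as the between-sample distance, and of stopping the merging once the p_test criterion is no longer met, can be illustrated with a minimal Python sketch. It is not the paper's Java program. Several details are assumptions made only for illustration: the randomization scheme (pooling the two samples and re-splitting them at random), the use of Euclidean distance as a stand-in for one of the fourteen measures, single linkage as a stand-in for one of the three clustering methods, and the names `randomized_distance`, `cluster`, `n_perm`, and `alpha`.

```python
import math
import random


def euclidean(x, y):
    """Euclidean distance; stands in for any one of the distance measures."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def randomized_distance(x, y, n_perm=999, rng=None, measure=euclidean):
    """Return 1 - p_test for two samples of equal length.

    Assumed scheme: pool both samples, shuffle, split back into two
    pseudo-samples, and count how often the randomized distance is at
    least as large as the observed one.
    """
    if rng is None:
        rng = random.Random(0)
    observed = measure(x, y)
    pooled = list(x) + list(y)
    n = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Small tolerance guards against floating-point ties.
        if measure(pooled[:n], pooled[n:]) >= observed - 1e-12:
            hits += 1
    p_test = (hits + 1) / (n_perm + 1)  # the observed arrangement counts too
    return 1.0 - p_test


def cluster(samples, alpha=0.05, n_perm=999, seed=0):
    """Single-linkage agglomeration that stops at the p_test criterion."""
    rng = random.Random(seed)
    k = len(samples)
    dist = {
        (i, j): randomized_distance(samples[i], samples[j], n_perm, rng)
        for i in range(k) for j in range(i + 1, k)
    }
    classes = [[i] for i in range(k)]

    def linkage(a, b):
        # Single linkage: smallest pairwise distance between two classes.
        return min(dist[min(i, j), max(i, j)] for i in a for j in b)

    while len(classes) > 1:
        d, a, b = min(
            (linkage(classes[a], classes[b]), a, b)
            for a in range(len(classes)) for b in range(a + 1, len(classes))
        )
        if d >= 1.0 - alpha:  # p_test <= alpha: the classes differ significantly
            break
        classes[a] = sorted(classes[a] + classes[b])
        del classes[b]
    return classes
```

Under these assumptions, two identical samples always receive a distance of 0, because every randomized distance is at least as large as the observed one, so p_test equals 1. Calling `cluster(samples)` returns lists of sample indices, one list per class that survives the significance criterion.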

