Computer Engineering & Science ›› 2010, Vol. 32 ›› Issue (5): 92-96.
XIE Mingxia1,2, GUO Jianzhong1, ZHANG Haibo3, CHEN Ke1
Abstract:
When distance measures designed for low-dimensional spaces are applied in
high-dimensional spaces, the distances between objects become increasingly
indistinguishable as the dimension grows. Developing efficient distance or
similarity (dissimilarity) measures for high-dimensional spaces is therefore both
important and challenging. By analyzing why traditional measures are inapplicable in
high-dimensional spaces and summarizing the existing similarity measures for
high-dimensional data, this paper proposes an improved function, HDsim(X,Y), to measure
the similarity between objects in high-dimensional space. HDsim(X,Y) integrates the
similarity measures for the various types of data: it takes full advantage of the
original function Hsim(X,Y) for numerical data, the Jaccard coefficient for binary data,
and the matching ratio for categorical data. Validity analysis and a case study
demonstrate that HDsim(X,Y) is effective in computing the similarity between objects in
high-dimensional space.
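
The abstract names three building blocks but does not give the formula for HDsim(X,Y). The following Python sketch illustrates only those building blocks and one possible way of combining them. Everything specific in it is an assumption, not the authors' definition: the form used for Hsim (the mean of 1 / (1 + |x_i - y_i|), a form commonly cited for that function), the convention that two all-zero binary vectors have Jaccard similarity 1, the attribute-count weighting, and the hypothetical names hsim_numeric, jaccard_binary, matching_ratio and combined_similarity.

```python
from typing import Hashable, Sequence


def hsim_numeric(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed form of Hsim: mean of 1 / (1 + |x_i - y_i|) over numeric attributes."""
    return sum(1.0 / (1.0 + abs(a - b)) for a, b in zip(x, y)) / len(x)


def jaccard_binary(x: Sequence[int], y: Sequence[int]) -> float:
    """Jaccard coefficient: attributes set in both / attributes set in either."""
    both = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    # Convention assumed here: two all-zero vectors count as identical.
    return both / either if either else 1.0


def matching_ratio(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Fraction of categorical attributes on which the two objects agree."""
    return sum(1 for a, b in zip(x, y) if a == b) / len(x)


def combined_similarity(num_x, num_y, bin_x, bin_y, cat_x, cat_y) -> float:
    """Hypothetical combination: per-type similarities weighted by attribute count."""
    parts = []
    if len(num_x) > 0:
        parts.append((len(num_x), hsim_numeric(num_x, num_y)))
    if len(bin_x) > 0:
        parts.append((len(bin_x), jaccard_binary(bin_x, bin_y)))
    if len(cat_x) > 0:
        parts.append((len(cat_x), matching_ratio(cat_x, cat_y)))
    total = sum(n for n, _ in parts)
    if total == 0:
        raise ValueError("objects have no attributes to compare")
    return sum(n * s for n, s in parts) / total


if __name__ == "__main__":
    score = combined_similarity(
        [1.0, 2.0], [1.0, 3.0],              # numerical attributes
        [1, 0, 1, 0], [1, 1, 0, 0],          # binary attributes
        ["a", "b", "c"], ["a", "x", "c"],    # categorical attributes
    )
    print(f"combined similarity: {score:.4f}")
```

Each per-type score lies in [0, 1], so the weighted average does too, which keeps objects with different mixes of attribute types comparable under this sketch. The paper's actual integration rule may differ.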
Key words: high dimensional data, similarity measurement, attribute similarity, spatial similarity
CLC Number:
TP18
XIE Mingxia1, 2, GUO Jianzhong1, ZHANG Haibo3, CHEN Ke1. Research on the Similarity Measurement of High Dimensional Data[J]. Computer Engineering & Science, 2010, 32(5): 92-96.
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2010/V32/I5/92