Computer Engineering & Science
Previous Articles Next Articles
JIANG Wen-chao1,LIN De-xi1,SUN Ao-bing2,WU Xiao-qiang2
Received:
Revised:
Online:
Published:
Abstract:
Data order-preserving matching is a key problem in big data applications. Data matching can be transformed into character or number matching through abstraction or reduction. We present a novel data order-preserving matching algorithm based on similarity filtration which includes three steps: data transformation, data reduction and similarity computation. Firstly, to reflect the relation of convex growth (descent) or concave growth (descent), the data is transformed into a binary string according to the relationship among the three neighbor numbers. Secondly, to compute the similarity more accurately, the data array and pattern array are both reduced into stable interval [0,1]. Finally, according to the variety range of the relevant nodes between data array and pattern array, the similarity can be computed and sorted. Theory analysis shows that the time complex is O(n), which is lower than the algorithm presented by Cho et al. Furthermore, our algorithm can overcome the deficiencies of the algorithm presented by Cho et al. including the incontrollable min-max values and the subsection inconsistency. Based on the similarity computation, all the sub-strings can be sorted for data retrieval or searching in big data applications.
Key words: big data application, pattern matching, order-preserving matching, similarity filtration
JIANG Wen-chao1,LIN De-xi1,SUN Ao-bing2,WU Xiao-qiang2.
0 / / Recommend
Add to citation manager EndNote|Ris|BibTeX
URL: http://joces.nudt.edu.cn/EN/
http://joces.nudt.edu.cn/EN/Y2017/V39/I07/1249