A novel big data order-preserving matching
algorithm based on similarity filtration

Computer Engineering & Science

JIANG Wen-chao1, LIN De-xi1, SUN Ao-bing2, WU Xiao-qiang2

(1. School of Computer, Guangdong University of Technology, Guangzhou 510006;
2. Institute of Guangdong Electronics Industry, Dongguan 523808, China)

Received: 2016-12-26 Revised: 2017-03-21 Online: 2017-07-25 Published: 2017-07-25

Abstract

Order-preserving data matching is a key problem in big data applications. Data matching can be transformed into character or number matching through abstraction or reduction. We present a novel order-preserving data matching algorithm based on similarity filtration, which consists of three steps: data transformation, data reduction, and similarity computation. First, to reflect convex or concave growth (or descent), the data are transformed into a binary string according to the relationship among each three neighboring numbers. Second, to compute the similarity more accurately, the data array and the pattern array are both reduced to the stable interval [0,1]. Finally, the similarity is computed and sorted according to the range of variation between corresponding nodes of the data array and the pattern array. Theoretical analysis shows that the time complexity is O(n), which is lower than that of the algorithm presented by Cho et al. Furthermore, our algorithm overcomes deficiencies of the algorithm of Cho et al., including uncontrollable min-max values and subsection inconsistency. Based on the similarity computation, all substrings can be sorted for data retrieval or search in big data applications.
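
The three steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact encoding rule or similarity formula, so everything concrete here is an assumption made for illustration: the binary code uses the sign of the second difference of each three neighboring values, the reduction to [0,1] is plain min-max scaling, and the similarity is one minus the mean absolute difference of the scaled values. The function names are hypothetical, and this naive version rescans every window, so it does not attempt to reach the O(n) bound claimed in the abstract.

```python
from typing import List, Sequence, Tuple


def to_binary_string(values: Sequence[float]) -> str:
    """Step 1 (assumed encoding): one bit per interior point.

    '1' if the three neighbors bend upward (second difference >= 0),
    '0' if they bend downward. The paper's actual rule may differ.
    """
    bits = []
    for i in range(1, len(values) - 1):
        second_diff = values[i - 1] - 2 * values[i] + values[i + 1]
        bits.append("1" if second_diff >= 0 else "0")
    return "".join(bits)


def normalize(values: Sequence[float]) -> List[float]:
    """Step 2 (assumed reduction): min-max scaling into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)  # constant input: no variation to scale
    return [(v - lo) / (hi - lo) for v in values]


def similarity(window: Sequence[float], pattern: Sequence[float]) -> float:
    """Step 3 (assumed measure): 1 minus mean absolute difference
    between the scaled window and the scaled pattern."""
    a, b = normalize(window), normalize(pattern)
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rank_matches(data: Sequence[float],
                 pattern: Sequence[float]) -> List[Tuple[int, float]]:
    """Filter windows by binary code, then sort survivors by similarity."""
    m = len(pattern)
    pattern_code = to_binary_string(pattern)
    results = []
    for start in range(len(data) - m + 1):
        window = data[start:start + m]
        if to_binary_string(window) != pattern_code:
            continue  # filtration: shape differs, skip similarity computation
        results.append((start, similarity(window, pattern)))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


if __name__ == "__main__":
    data = [1, 3, 2, 5, 10, 30, 12, 50]
    pattern = [1, 3, 2, 5]
    for start, score in rank_matches(data, pattern):
        print(start, round(score, 3))
```

In this sketch the binary code acts only as a cheap filter: windows whose shape differs from the pattern are discarded before any similarity is computed, and the remaining candidates are returned in descending order of similarity.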

Key words: big data application, pattern matching, order-preserving matching, similarity filtration

JIANG Wen-chao, LIN De-xi, SUN Ao-bing, WU Xiao-qiang. A novel big data order-preserving matching algorithm based on similarity filtration [J]. Computer Engineering & Science.

