• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

计算机工程与科学

• 人工智能与数据挖掘 • 上一篇    下一篇

无重叠条件模式匹配的在线求解算法

武优西,王博,高雪冬   

  1. (河北工业大学人工智能与数据科学学院,天津 300401)
  • 收稿日期:2019-04-18 修回日期:2019-06-25 出版日期:2019-12-25 发布日期:2019-12-25
  • 基金资助:

    国家自然科学基金(61976240,61702157)

An online pattern matchingsolving algorithm
 under the nonoverlapping condition
 

WU You-xi,WANG Bo,GAO Xue-dong   

  1. (School of Artificial Intelligence,Hebei University of Technology,Tianjin 300401,China)
  • Received:2019-04-18 Revised:2019-06-25 Online:2019-12-25 Published:2019-12-25

摘要:

无重叠条件模式匹配是众多间隙约束的模式匹配算法中的一种,尽管当前证明了无重叠条件模式匹配是一个多项式时间复杂度问题,并提出了有效的求解算法,但是当前求解算法采用离线计算方式,具有空间复杂度较高的缺点。为了解决该问题,设计了一种在线求解算法,该算法一边读入序列串,一边在流网树中寻找符合约束条件的树根-树叶路径,以快速剪枝无用节点,从而加快了匹配速度。与离线算法的空间复杂度相比,在线算法的空间复杂度为O(m×maxlen×W),这里m,maxlen和W分别表示模式串长度、模式最大长度约束和最大间隙约束。实验结果不仅验证了算法的完备性,与现有算法相比,在内存占用上均有较大性能的提升。
 
 

关键词: 模式匹配, 间隙约束, 无重叠条件, 在线算法, 网树

Abstract:

Pattern matching under the nonoverlapping condition is one of many pattern matching algorithms for gap constraint problems.Although it is proved that pattern matching under the nonoverlapping condition is a polynomial time complexity problem and an effective solving algorithm is proposed, the current solving algorithm adopts offline calculation, which has the disadvantage of high spatial complexity.In order to solve this problem, this paper designs an online solving algorithm. The algorithm reads the sequence string and finds the root-leaf path in the flow net tree that meets the constraint conditions, which realizes fast pruning of useless nodes and thus speeds up the matching speed.Compared with the spatial complexity of the offline algorithm, the spatial complexity of the online algorithm is O(m×maxlen×W), where m, maxlen, and W represent the pattern string length, the maximum pattern length constraint and the maximum gap constraint, respectively.Experimental results not only verify the completeness of our proposed algorithm, but also demonstrate that it consumes less memory than the existing algorithm.

Key words: pattern matching, gap constraint, nonoverlapping condition, online algorithm, Nettree