• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science

• Article

A data chunking algorithm based on bit-string content awareness

ZHOU Bin1, ZHU Rongbo1, ZHANG Ying2

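  1. (1. College of Computer Science, South-Central University for Nationalities, Wuhan 430074, Hubei; 2. School of Foreign Languages, Huazhong University of Science and Technology, Wuhan 430074, Hubei)
  • Received: 2016-03-18 Revised: 2016-05-03 Published: 2016-10-25 Released: 2016-10-25
  • Funding:

    National Natural Science Foundation of China (61272497); Natural Science Foundation of Hubei Province (2013CFB447)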

A bit string contentaware data chunking algorithm

ZHOU Bin1, ZHU Rongbo1, ZHANG Ying2   

  1. (1. College of Computer Science, South-Central University for Nationalities, Wuhan 430074, China;
    2. School of Foreign Languages, Huazhong University of Science and Technology, Wuhan 430074, China)
  • Received: 2016-03-18 Revised: 2016-05-03 Online: 2016-10-25 Published: 2016-10-25

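Abstract:

To address the heavy CPU cost of computing digital signatures in content-defined chunking (CDC), the content-based, variable-length approach to chunking, this paper proposes a data chunking algorithm based on bit-string content awareness. The algorithm uses the bit-feature information revealed by each failed match attempt to rule out as many non-matching positions as possible, which yields the maximum jump length and reduces the cost of intermediate computation and comparison. Experimental results show that the algorithm reduces the digital-signature computation overhead during chunking and lowers the CPU consumption of determining chunk boundaries, thereby improving the time performance of data chunking.

Keywords: bit-string content awareness, data chunking, digital signature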

Abstract:

Content-defined chunking (CDC) algorithms spend a large share of their CPU time computing digital signatures. To address this overhead, we present a novel data chunking algorithm based on bit-string content awareness. The proposed algorithm uses the bit-feature information acquired from each failed match attempt to eliminate as many non-matching positions as possible. Because this yields the maximum jump length, the cost of intermediate calculation and comparison is reduced. Experimental results show that the algorithm reduces the overhead of digital signature calculation during data chunking, cuts CPU consumption for chunk boundary determination, and improves the time performance of data chunking.

Key words: bit-string content-aware, data chunking, digital signature
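
The abstract does not give the matching rule itself, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical bit feature (the lowest bit of each byte) and a hypothetical boundary rule (a window of `width` consecutive 1 bits). When a window fails, the rightmost 0 bit found in it shows that every later window still covering that bit must also fail, so the search can jump straight past it. The function names `low_bits`, `find_cuts_naive`, and `find_cuts_skipping` are illustrative.

```python
import random


def low_bits(data):
    """Hypothetical bit feature: the least significant bit of every byte."""
    return [b & 1 for b in data]


def find_cuts_naive(bits, width):
    """Slide the window one position at a time.

    Returns (cut positions, number of windows examined).
    """
    cuts, checks = [], 0
    pos = 0  # start index of the current window
    while pos + width <= len(bits):
        checks += 1
        if all(bits[pos:pos + width]):
            cuts.append(pos + width)  # boundary right after the window
            pos += width              # next chunk starts after the cut
        else:
            pos += 1
    return cuts, checks


def find_cuts_skipping(bits, width):
    """Same boundary rule, but jump using what the failed match revealed.

    Assumption for this sketch: the pattern is all ones, so the rightmost
    0 bit in a failed window rules out every window that still covers it.
    """
    cuts, checks = [], 0
    pos = 0
    while pos + width <= len(bits):
        checks += 1
        j = width - 1
        # Scan the window from right to left for the rightmost 0 bit.
        while j >= 0 and bits[pos + j] == 1:
            j -= 1
        if j < 0:
            cuts.append(pos + width)  # whole window matched
            pos += width
        else:
            pos += j + 1              # largest safe jump: just past the 0 bit
    return cuts, checks


if __name__ == "__main__":
    rng = random.Random(0)
    data = bytes(rng.randrange(256) for _ in range(4096))
    bits = low_bits(data)
    naive_cuts, naive_checks = find_cuts_naive(bits, 8)
    skip_cuts, skip_checks = find_cuts_skipping(bits, 8)
    print("same boundaries:", naive_cuts == skip_cuts)
    print("windows examined:", naive_checks, "vs", skip_checks)
```

Both functions return the same cut positions; the skipping variant simply examines fewer windows. A real CDC system would use a different boundary pattern and feature extraction, and the jump rule would have to be derived for that pattern.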