• 中国计算机学会会刊
  • 中国科技核心期刊
  • 中文核心期刊

J4 ›› 2010, Vol. 32 ›› Issue (12): 128-133.doi: 10.3969/j.issn.1007130X.2010.

• 论文 • 上一篇    下一篇

基于多级相关图的大规模词典完美哈希函数构造算法

李海涛   

  1. (台州职业技术学院计算机工程系,浙江 台州 318000)
  • 收稿日期:2009-07-14 修回日期:2010-01-15 出版日期:2010-12-25 发布日期:2010-12-25
  • 通讯作者: 李海涛
  • 作者简介:李海涛(1976),男,吉林吉林人,讲师,研究方向为计算机应用。

A Perfect Hash Function for LargeScale Dictionaries Based on MultiStage Dependency Graphs

LI Haitao   

  1. (Department of Computer Engineering,Taizhou Vocational and Technical College,Taizhou 318000,China)
  • Received:2009-07-14 Revised:2010-01-15 Online:2010-12-25 Published:2010-12-25

摘要:

在哈希函数中,如果两个不同的单词被映射到同一个槽,那么我们称为冲突。当哈希函数存在冲突时,将降低词典查找的速度。由于完美哈希函数完全避免了冲突,因此在许多对查找性能要求较高的应用中广泛使用。本文就此提出了一种基于多级相关图的大规模词典完美哈希函数的构造算法。词典单词的每个字符(首字母除外)都用两个平滑函数平滑为两个字符,构建平滑后词典对应的多级相关图,多级相关图的结点度都比较小,而且分布比较均匀,因此更容易生成完美哈希函数。实验表明:基于多级相关图的哈希函数构造算法适用于大规模词典,填充因子接近1,同时工作空间比已有算法都要小。

关键词: 完美哈希函数, 多极相关图, 大规模词典, 平滑

Abstract:

In the hash function, if there are two different words are  mapped into the same slot, we call it a conflict. When the hash function appears in the conflict, it slows the dictionary’s finding speed. Since the perfect hash function completely avoids the conflict, it is used popularly in the application of high function demands. So a perfect hash function based on multistage dependency graphs is presented in this paper. Each character (except for the first one) of a key is smoothed into two smoothed characters using two different smoothing functions. A multistage dependency graph is constructed. The degree of the vertices in the multistage dependency graph of the smoothed keyset is much more evenly distributed, so that a perfect hash function is easier to obtain. The perfect hash function is promising to give solution to large keysets. The load factor remains still high even for huge keysets over large charactersets. Experiments indicate that the load factor is competitive, and meanwhile its working space is less than that of the existing hash functions.

Key words: perfect hash function;multistage dependency graph;lagerscale