• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2022, Vol. 44 ›› Issue (03): 436-446.

• Computer Networks and Information Security •

Collaborative filtering recommendation based on local sensitive hash in blockchain environment

WANG Jing (汪静), QIAN Xiao-dong (钱晓东)

  1. (School of Economics and Management, Lanzhou Jiaotong University, Lanzhou 730070, China)
  • Received: 2020-10-20 Revised: 2021-02-08 Accepted: 2022-03-25 Online: 2022-03-25 Published: 2022-03-24
  • Funding:
    National Natural Science Foundation of China (71461017)

Abstract: Massive, high-dimensional data in the blockchain environment degrades recommendation performance. To address this, the locality-sensitive hashing (LSH) algorithm is optimized to reduce the extra computation and storage overhead it incurs during nearest neighbor search. The principal components of the data distribution are used to reduce the poorly captured projection directions of traditional LSH, and the projection vector weights are quantized to reduce the number of hash tables and hash functions needed. The hash bucket interval is adjusted, and the query result set is further refined according to the number of collisions, which significantly reduces the time spent on distance computation. Finally, a weighted average strategy is used to predict ratings and generate a recommendation list. Experiments show that, compared with other algorithms, the optimized LSH needs only a small number of hash tables and hash functions to obtain fairly accurate nearest neighbor search results, with greatly improved search efficiency. The optimized LSH copes well with the problems caused by the characteristics of blockchain data, alleviates the impact of high-dimensional, large-scale data on recommendation performance, and improves recommendation quality and efficiency to a certain extent.

Key words: locality sensitive hashing, blockchain, data distribution, recommendation performance, nearest neighbor search
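
The pipeline outlined in the abstract can be illustrated with a minimal Python/NumPy sketch. It is not the authors' implementation: every function name and parameter (`pca_directions`, `bucket_ids`, `collision_candidates`, `predict_rating`, `width`, `min_collisions`) is hypothetical. The sketch also assumes cosine similarity as the neighbor weight and treats a rating of 0 as "unrated", and it does not model the quantization of projection vector weights, whose details the abstract does not give.

```python
import numpy as np


def pca_directions(data, n_components):
    """Top principal directions (as rows) of the mean-centred data."""
    data = np.asarray(data, dtype=float)
    centred = data - data.mean(axis=0)
    # Rows of vt are unit-length principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:n_components]


def bucket_ids(data, mean, directions, width):
    """Project each point onto every direction and quantise into buckets of size `width`."""
    data = np.asarray(data, dtype=float)
    return np.floor((data - mean) @ np.asarray(directions).T / width).astype(int)


def collision_candidates(query_ids, all_ids, min_collisions):
    """Indices of points that share a bucket with the query on at least `min_collisions` directions."""
    collisions = (np.asarray(all_ids) == np.asarray(query_ids)).sum(axis=1)
    return np.flatnonzero(collisions >= min_collisions)


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def predict_rating(ratings, target, neighbours, item):
    """Similarity-weighted average of the neighbours' ratings for `item`.

    Assumption: a rating of 0 means "unrated". Returns None if no neighbour rated the item.
    """
    ratings = np.asarray(ratings, dtype=float)
    num = den = 0.0
    for v in neighbours:
        if ratings[v, item] == 0:
            continue
        sim = cosine(ratings[target], ratings[v])
        num += sim * ratings[v, item]
        den += abs(sim)
    return num / den if den > 0 else None


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ratings = rng.integers(0, 6, size=(50, 20)).astype(float)  # synthetic user-item matrix
    dirs = pca_directions(ratings, 4)
    ids = bucket_ids(ratings, ratings.mean(axis=0), dirs, width=3.0)
    cands = collision_candidates(ids[0], ids, min_collisions=3)
    cands = cands[cands != 0]  # exclude the target user itself
    print(predict_rating(ratings, 0, cands, item=5))
```

In this sketch, a larger `width` or a smaller `min_collisions` widens the candidate set, so more exact similarity computations are needed. A stricter collision threshold prunes candidates before any distance is computed, which is the trade-off the abstract describes.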