• Journal of the China Computer Federation
  • Chinese Core Journal of Science and Technology
  • Chinese Core Journal

Computer Engineering & Science (计算机工程与科学)

• Articles

A document similarity calculation method based on fully homomorphic encryption technology for cloud storage

JIANG Xiao-ping (江小平), ZHANG Wei (张巍), LI Cheng-hua (李成华), ZHOU Hang (周航), SUN Jing (孙婧)

  1. (College of Electronics and Information Engineering, South-Central University for Nationalities, Wuhan 430074, Hubei, China)
  • Received: 2015-11-13 Revised: 2016-06-07 Online: 2017-10-25 Published: 2017-10-25
  • Supported by:

    Fundamental Research Funds for the Central Universities (CZW15043, CZQ14001); Natural Science Foundation of Hubei Province (2014CFB916)

Abstract:

To address users' privacy requirements in cloud storage services, we propose a method for computing document similarity over ciphertexts. The data owner uploads the document ID, the encrypted document, and the ciphertext of the document's simhash value to the cloud server. The cloud service provider then performs fully homomorphic addition on the simhash ciphertext of the document whose similarity is to be computed and the simhash ciphertexts of the data owner's documents, obtaining the ciphertext of the Hamming distance between the documents. The data owner decrypts the Hamming distance ciphertexts to obtain a ranking of the documents by similarity. Because the cloud completes the similarity computation without learning either the document contents or their simhash plaintexts, data privacy is preserved. We describe the method in detail and report experimental data that verify its feasibility and correctness.

Key words: cloud storage service, fully homomorphic encryption technology, document similarity calculation, simhash, privacy preservation
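
The workflow described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not name the encryption scheme, the simhash width, or the tokenization, so the sketch assumes a 64-bit simhash over whitespace tokens with unit weights, and it substitutes a toy symmetric integer scheme in the style of van Dijk, Gentry, Halevi and Vaikuntanathan that supports only the additions needed here and uses insecure parameters. All function names are hypothetical. The cloud adds the bit ciphertexts pairwise, which corresponds to XOR on the plaintext bits. As a simplification, the owner decrypts the result and counts the ones to obtain the Hamming distance, rather than receiving a single distance ciphertext.

```python
import hashlib
import secrets

BITS = 64  # assumed simhash width; the abstract does not state it


def simhash(text: str, bits: int = BITS) -> int:
    """Simhash over whitespace tokens, each token with weight 1 (assumed)."""
    counts = [0] * bits
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # Bit i of the fingerprint is 1 when the votes for that bit are positive.
    return sum(1 << i for i in range(bits) if counts[i] > 0)


# --- Toy symmetric scheme over the integers: illustration only, NOT secure ---

def keygen() -> int:
    """Return an odd 128-bit secret key p, far larger than the noise below."""
    return secrets.randbits(128) | (1 << 127) | 1


def encrypt_bit(p: int, m: int) -> int:
    """Encrypt one bit as c = m + 2r + p*q with small random noise r."""
    r = secrets.randbits(16)
    q = secrets.randbits(256)
    return m + 2 * r + p * q


def decrypt_bit(p: int, c: int) -> int:
    """Recover the bit as (c mod p) mod 2; valid while the noise stays below p."""
    return (c % p) % 2


def encrypt_simhash(p: int, value: int, bits: int = BITS) -> list[int]:
    """Data owner: encrypt the fingerprint bit by bit, lowest bit first."""
    return [encrypt_bit(p, (value >> i) & 1) for i in range(bits)]


def cloud_add(ca: list[int], cb: list[int]) -> list[int]:
    """Cloud: add ciphertexts pairwise; each sum decrypts to the XOR of the bits."""
    return [x + y for x, y in zip(ca, cb)]


def owner_hamming(p: int, c_sum: list[int]) -> int:
    """Data owner: decrypt the XOR bits and count the ones."""
    return sum(decrypt_bit(p, c) for c in c_sum)
```

In this sketch the cloud only ever sees the integers produced by `encrypt_simhash` and `cloud_add`; ranking the stored documents amounts to sorting them by the value `owner_hamming` returns for each one, with smaller distances indicating more similar documents.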