• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (11): 2154-2161.

• Article

Research on privacy preservation for functional dependencies

杨高明1,方贤进1,陆奎1,王静2   

  1. (1.安徽理工大学计算机科学与工程学院,安徽 淮南 232001;2.中国民航大学中国民航信息技术科研基地,天津 300300)
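  • Received: 2015-08-15 Revised: 2015-10-20 Published: 2015-11-25 Released: 2015-11-25
  • Funding:

    National Natural Science Foundation of China (61402012, 61572034); Natural Science Foundation of Anhui Higher Education Institutions (KJ2014A061); Anhui Postdoctoral Foundation (2014B021); Open Project Foundation of the Information Technology Research Base of Civil Aviation Administration of China (CAACITRB201404)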

Privacy-preserving method for datasets with functional dependencies

YANG Gaoming1,FANG Xianjin1,LU Kui1,WANG Jing2   

  1. (1.School of Computer Science and Engineering,Anhui University of Science and Technology,Huainan 232001;
    2.Information Technology Research Base of Civil Aviation Administration of China,
    Civil Aviation University of China,Tianjin 300300,China)
  • Received:2015-08-15 Revised:2015-10-20 Online:2015-11-25 Published:2015-11-25

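Abstract:

While the development of information technology makes daily life more convenient, it also brings the risk of personal privacy disclosure, and data anonymization is an effective way to prevent such disclosure. However, existing anonymization methods mainly aim to break the association between quasi-identifier attributes and sensitive attributes; they do not consider the functional dependencies that exist among quasi-identifier attributes, or between quasi-identifier attributes and sensitive attributes. To address this problem in privacy-preserving data publishing, we study how to effectively protect users' private information when functional dependencies exist in the data. First, for datasets with functional dependencies, we propose the (l, α)-diversity privacy model. Second, to better protect user privacy while increasing data utility, we propose an anonymization algorithm based on a hybrid of perturbation and generalization/suppression. Finally, experiments verify the effectiveness and efficiency of the algorithm, and the results are analyzed theoretically.

Keywords: privacy preservation, data publishing, functional dependency, generalization, utility, perturbation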

Abstract:

The development of information technology makes people's lives more convenient, but it also introduces the risk of personal privacy disclosure. Data anonymization is an effective way to prevent such disclosure. However, few existing anonymity principles consider hostile attacks against datasets with functional dependencies. We therefore study privacy-preserving data publishing (PPDP) when functional dependencies exist in the dataset, and show how to preserve privacy when private information is made vulnerable by those dependencies. First, we propose an (l, α)-diversity privacy model to protect the privacy of individuals. Second, to achieve better privacy protection and higher data utility, we design an anonymization algorithm based on a hybrid of perturbation and generalization/suppression. Finally, experiments verify the effectiveness and efficiency of the proposed algorithm, and we give a detailed theoretical analysis of the experimental results.

Key words: privacy preservation; data publishing; functional dependency; generalization; utility; perturbation
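
As an illustration of the notion of functional dependency that the abstract relies on, the following minimal sketch checks whether a dependency X → Y holds in a small table. It is not the paper's (l, α)-diversity model or its anonymization algorithm, whose definitions are not given on this page. The function name `fd_holds`, the attribute names, and the toy rows are all hypothetical and chosen only for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows: list of dicts (one dict per record); lhs, rhs: lists of attribute names.
    The dependency holds when records that agree on lhs also agree on rhs.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value; any later
        # record with the same lhs but a different rhs violates the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical toy table (not data from the paper).
rows = [
    {"zip": "232001", "city": "Huainan", "disease": "flu"},
    {"zip": "232001", "city": "Huainan", "disease": "asthma"},
    {"zip": "300300", "city": "Tianjin", "disease": "flu"},
]

zip_to_city = fd_holds(rows, ["zip"], ["city"])
zip_to_disease = fd_holds(rows, ["zip"], ["disease"])
```

In this toy table, zip → city holds, so suppressing only the city column would not hide it from someone who can see the zip column and knows the dependency. This is the kind of leakage through dependent attributes that motivates treating functional dependencies explicitly during anonymization.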