• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (04): 830-834.

• Papers •

  • Funding:

    Guangzhou Science and Technology Program project (2012Y200046)

A DP-DBScan clustering algorithm based on differential privacy preserving

WU Weimin,HUANG Huankun   

  1. (School of Computer,Guangdong University of Technology,Guangzhou 510006,China)
  • Received:2014-01-13 Revised:2014-04-03 Online:2015-04-25 Published:2015-04-25

Abstract:

Differential privacy preserving is a privacy-preserving method based on data distortion: it distorts sensitive data by adding random noise while preserving the statistical properties of the data. To address the privacy leakage that can occur during DBScan cluster analysis, we present a novel DP-DBScan clustering algorithm within the framework of differential privacy preserving. Subject to the constraint of ε-differential privacy, DP-DBScan introduces and implements differential privacy protection on top of the density-based DBScan clustering algorithm. The algorithm protects personal privacy effectively and is applicable to data sets of different sizes and dimensions. Experimental results show that, compared with the DBScan clustering algorithm, DP-DBScan maintains clustering validity and achieves differential privacy preserving when a small amount of random noise is added.

Key words: differential privacy; DBScan; DP-DBScan; privacy preserving; data mining
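
The abstract does not state where DP-DBScan injects its noise, so the following is only an illustrative sketch of the general idea of "adding random noise" to a density-based step, not the authors' algorithm. It assumes the Laplace mechanism is applied to a single neighborhood count (sensitivity 1, so the noise scale is 1/ε). The names `laplace_noise`, `noisy_neighbor_count`, `is_core_point`, `radius`, and `min_pts` are hypothetical and chosen for this example.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    rate 1/scale follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_neighbor_count(points, center, radius, epsilon, rng):
    """Count the points within `radius` of `center`, then add Laplace noise.

    Assumption: adding or removing one record changes the count by at
    most 1, so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for p in points if math.dist(p, center) <= radius)
    return true_count + laplace_noise(1.0 / epsilon, rng)


def is_core_point(points, center, radius, min_pts, epsilon, rng):
    """DBSCAN-style core-point test on the noisy count instead of the true one."""
    return noisy_neighbor_count(points, center, radius, epsilon, rng) >= min_pts


if __name__ == "__main__":
    rng = random.Random(0)
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
    # Smaller epsilon means stronger privacy and a noisier count.
    for eps in (0.1, 1.0, 10.0):
        print(eps, noisy_neighbor_count(data, (0.0, 0.0), 0.5, eps, rng))
```

This covers one noisy query only. A complete private clustering procedure would also have to account for how the privacy budget ε is divided across all the queries it issues, which this sketch leaves out.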