• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

计算机工程与科学 (Computer Engineering & Science)


数字图书馆中图编码匿名方法

贾俊杰,陈菲,闫国蕾,邢里程   

  1. (西北师范大学计算机科学与工程学院,甘肃 兰州 730070)
  • 收稿日期:2016-07-15 修回日期:2016-09-28 出版日期:2016-11-25 发布日期:2016-11-25
  • Supported by:

    Lanzhou Science and Technology Plan Project (20141256)

An anonymous method for Chinese library classification encoding in digital library

JIA Junjie,CHEN Fei,YAN Guolei,XING Licheng   

  1. (School of Computer Science and Engineering,Northwest Normal University,Lanzhou 730070,China)
  • Received:2016-07-15 Revised:2016-09-28 Online:2016-11-25 Published:2016-11-25

摘要:

现如今数字图书馆所发布的大部分数据只包含图书资源的相关信息,并没有用户属性与图书资源共同发布的数据,使得分析者不能从现有发布数据中分析出更多的信息,对有些科学研究造成困扰。建立一种用户属性与图书信息共同发布的匿名方式,首先将所有图书使用图书分类号进行重新编码,其次根据重新编码的稀疏情况将整个数据进行划分,最后在每个划分中使用置换方法进行匿名。实验结果表明,最终匿名表的数据具有较高的准确性和实用性,并能够通过散点图的方式直观地看到属性间的关系,为科学研究提供更多有用信息。

关键词: 数字图书馆, 数据发布, 隐私保护, (k,e)-匿名, 散点图

Abstract:

Most of the data currently published by digital libraries contains only information about book resources; data that combines user attributes with book resources is not released. Analysts therefore cannot extract further information from the published data, which hampers some scientific research. We propose an anonymization method that publishes user attributes and book information together. We first recode all books using their Chinese Library Classification numbers, then partition the whole dataset according to the sparsity of the recoded values, and finally anonymize each partition with a permutation method. Experimental results show that the final anonymized table has relatively high accuracy and utility, and that the relationships between attributes can be seen intuitively in scatter plots, thus providing more useful information for scientific research.

Key words: digital library, data publishing, privacy protection, (k,e)-anonymity, scatter plot
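
The three steps named in the abstract (recoding, partitioning, permutation) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the recoding scheme, the sparsity criterion, or the permutation procedure, so everything concrete here is an assumption. Records are assumed to carry one user attribute (`age`) and one classification number (`clc`); "sparsity" is approximated by pooling top-level classes that have fewer than `k` records; and the function names `clc_top_class`, `partition_by_class`, `permute_within`, and `anonymize` are hypothetical. The sketch also does not check the (k,e)-anonymity condition listed in the keywords.

```python
import random
from collections import defaultdict


def clc_top_class(clc_number):
    """Return the leading letter of a classification number (assumed recoding)."""
    return clc_number[0].upper()


def partition_by_class(records, k):
    """Group records by top-level class; pool classes with fewer than k records.

    Assumption: this pooling stands in for the paper's sparsity-based
    partitioning. The pooled partition may itself hold fewer than k records.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[clc_top_class(rec["clc"])].append(rec)

    partitions, sparse_pool = [], []
    for key in sorted(groups):
        if len(groups[key]) >= k:
            partitions.append(groups[key])
        else:
            sparse_pool.extend(groups[key])
    if sparse_pool:
        partitions.append(sparse_pool)
    return partitions


def permute_within(partition, rng):
    """Shuffle the book codes among the records of one partition.

    User attributes stay in place, so the multiset of codes in the
    partition is preserved while the user-to-book link is broken.
    """
    codes = [rec["clc"] for rec in partition]
    rng.shuffle(codes)
    return [{"age": rec["age"], "clc": code}
            for rec, code in zip(partition, codes)]


def anonymize(records, k=2, seed=0):
    """Partition the records, then permute book codes inside each partition."""
    rng = random.Random(seed)
    return [permute_within(p, rng) for p in partition_by_class(records, k)]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        {"age": 19, "clc": "TP311"},
        {"age": 23, "clc": "TP393"},
        {"age": 31, "clc": "TP18"},
        {"age": 45, "clc": "I247"},
        {"age": 52, "clc": "K825"},
    ]
    for part in anonymize(sample, k=2):
        print(part)
```

Because only the book codes move, per-partition aggregates over the classification column are unchanged, which is the property that makes scatter plots of the published table still informative under this assumed scheme.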