• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2016, Vol. 38 ›› Issue (02): 350-355.

• Paper •

An effective continuous attribute reduction algorithm based on neighborhood entropy-based measurement

LI Shaonian, WU Lianggang

  1. (School of Business, Central South University, Changsha 410083, China)
  • Received: 2015-03-18 Revised: 2015-07-01 Online: 2016-02-25 Published: 2016-02-25
  • Supported by:

    Innovative Research Group Project of the National Natural Science Foundation of China (70921001); Key Joint R&D Project for Business Support of China Mobile Communications Group (2014_LH_21)

Abstract:

This paper reviews the basic definitions and properties of neighborhood rough sets and neighborhood entropy. To avoid the loss of feature information caused by discretizing continuous attributes during attribute reduction in numerical information systems, we propose a new continuous attribute reduction algorithm based on a neighborhood entropy-based measurement. The algorithm expands the core attribute set of a neighborhood information system (NIS) into a reduced attribute set. During this expansion, the neighborhood entropy-based measurement considers not only changes in the positive region of the reduced attribute set, but also how the neighborhood equivalence classes of samples in the negative region are distributed over the decision attribute partition, which gives it a finer granularity for measuring neighborhood relations. Experiments on UCI standard datasets show that, compared with attribute reduction algorithms based on neighborhood approximation measurement, neighborhood effective information ratio measurement, and neighborhood soft margin measurement, the proposed algorithm reduces the attributes of an NIS effectively while maintaining better classification accuracy for the reduced attribute set.

Key words: attribute reduction; neighborhood entropy-based measurement; core attribute; neighborhood information system; sample space in negative region; classification accuracy
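
The abstract does not state the exact form of the measurement, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It uses a commonly cited neighborhood conditional entropy, H(D|B) = -(1/n) Σ log2(|δ_B(x_i) ∩ [x_i]_D| / |δ_B(x_i)|), as a stand-in, together with a plain greedy forward selection that starts from a given core. The function names, the Euclidean δ-neighborhood, the radius `delta`, and the toy data are all hypothetical. Under this stand-in, samples in the positive region contribute zero, so the value is driven by how the remaining samples' neighborhoods spread over the decision classes.

```python
import math

def neighborhoods(X, attrs, delta):
    """For each sample, list the indices within Euclidean distance delta on attrs."""
    n = len(X)
    return [[j for j in range(n)
             if math.sqrt(sum((X[i][a] - X[j][a]) ** 2 for a in attrs)) <= delta]
            for i in range(n)]

def positive_region(X, y, attrs, delta):
    """Samples whose whole neighborhood carries their own decision label."""
    return [i for i, nb in enumerate(neighborhoods(X, attrs, delta))
            if all(y[j] == y[i] for j in nb)]

def neighborhood_conditional_entropy(X, y, attrs, delta):
    """Assumed stand-in measure: average of -log2(label-consistent share of each neighborhood)."""
    total = 0.0
    for i, nb in enumerate(neighborhoods(X, attrs, delta)):
        same = sum(1 for j in nb if y[j] == y[i])  # at least 1: a sample is its own neighbor
        total += math.log2(same / len(nb))
    return -total / len(X)

def greedy_reduct(X, y, delta, core=(), tol=1e-12):
    """Expand the core by repeatedly adding the attribute that lowers the entropy most."""
    reduct = list(core)
    remaining = [a for a in range(len(X[0])) if a not in reduct]
    current = neighborhood_conditional_entropy(X, y, reduct, delta)
    while remaining:
        best_attr, best_h = None, current
        for a in remaining:
            h = neighborhood_conditional_entropy(X, y, reduct + [a], delta)
            if h < best_h - tol:
                best_attr, best_h = a, h
        if best_attr is None:  # no attribute improves the measure any further
            break
        reduct.append(best_attr)
        remaining.remove(best_attr)
        current = best_h
    return reduct

# Hypothetical toy data: attribute 0 separates the classes, 1 is noise, 2 is constant.
X = [[0.10, 0.50, 1.0], [0.20, 0.10, 1.0], [0.15, 0.90, 1.0],
     [0.80, 0.45, 1.0], [0.90, 0.15, 1.0], [0.85, 0.95, 1.0]]
y = [0, 0, 0, 1, 1, 1]

print(greedy_reduct(X, y, delta=0.2))
print(positive_region(X, y, [0], 0.2))
```

On this toy data the selection keeps only attribute 0, because every neighborhood under it is already label-consistent. The pairwise distance scan is quadratic in the number of samples, which is acceptable for a sketch but would need an index structure for larger datasets.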