• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


Dynamic rule induction for hybrid information
systems based on neighborhood granulation
 

CHENG Yi(1,2), LIU Yong(3)

  (1. College of Computer Science, Sichuan University, Chengdu 610000;
   2. Department of Information and Engineering, Sichuan College of Architectural Technology, Chengdu 610000;
   3. Department of Equipment Engineering, Sichuan College of Architectural Technology, Deyang 618000, China)
     
  • Received: 2018-09-03  Revised: 2018-12-21  Online: 2019-07-25  Published: 2019-07-25

Abstract:

Existing knowledge discovery models for hybrid information systems mostly handle symbolic and numerical condition attributes together with symbolic decision attributes. Most of these models focus on attribute reduction or feature selection, while research on rule extraction is relatively scarce. We construct a dynamic rule induction model for hybrid information systems that covers more data types. Firstly, we modify the existing formulas for calculating value differences between attributes of different types and define the distance between cross-level symbolic values, which together yield a new mixed distance. Secondly, we propose three methods to induce decision classes for numerical decision attributes. Then, we propose a generalized neighborhood rough set model based on neighborhood granulation and present the lower and upper approximations of an arbitrary subset under dynamic granulation, which lays the foundation for constructing a dynamic rule induction algorithm. The model can extract rules from information systems with the following features: (1) the condition attribute set includes single-level symbolic, cross-level symbolic, numeric, interval-valued, set-valued and missing data; (2) the decision attribute set can include symbolic and numeric data. The rule induction algorithm is evaluated on several data sets from the UC Irvine Machine Learning Repository. Experimental results show that the algorithm can achieve good classification performance.
 

 

Key words: rule induction, hybrid information systems, granularity, neighborhood
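
The abstract names a mixed distance and neighborhood-based lower and upper approximations without giving their formulas. The Python sketch below illustrates only the general idea of a neighborhood rough set over mixed data. It is not the authors' model: the per-attribute differences (scaled gap for numeric values, 0/1 overlap for symbolic values, Jaccard distance for set-valued data, "a missing value matches anything") and the averaging in `mixed_distance` are assumptions chosen for illustration, all function names are hypothetical, and interval-valued and cross-level symbolic data are left out. Each attribute column is assumed to hold values of a single type.

```python
from typing import List, Sequence, Set, Tuple, Union

Value = Union[float, str, frozenset, None]


def attribute_difference(a: Value, b: Value, value_range: float) -> float:
    """Difference in [0, 1] between two values of one attribute (illustrative)."""
    if a is None or b is None:
        return 0.0  # assumption: a missing value matches anything
    if isinstance(a, frozenset) and isinstance(b, frozenset):
        union = a | b
        return 1.0 - len(a & b) / len(union) if union else 0.0  # Jaccard distance
    if isinstance(a, str) or isinstance(b, str):
        return 0.0 if a == b else 1.0  # overlap metric for symbolic values
    return abs(a - b) / value_range if value_range > 0 else 0.0  # scaled numeric gap


def numeric_ranges(objects: Sequence[Sequence[Value]]) -> List[float]:
    """Value range of each column; 0.0 for columns without numeric values."""
    ranges = []
    for column in zip(*objects):
        nums = [v for v in column if isinstance(v, (int, float))]
        ranges.append(max(nums) - min(nums) if nums else 0.0)
    return ranges


def mixed_distance(x: Sequence[Value], y: Sequence[Value], ranges: Sequence[float]) -> float:
    """Assumed mixed distance: mean of the per-attribute differences."""
    diffs = [attribute_difference(a, b, r) for a, b, r in zip(x, y, ranges)]
    return sum(diffs) / len(diffs)


def approximations(
    target: Set[int], objects: Sequence[Sequence[Value]], delta: float
) -> Tuple[Set[int], Set[int]]:
    """Lower and upper approximations of `target` (object indices) at radius `delta`."""
    ranges = numeric_ranges(objects)
    lower, upper = set(), set()
    for i, x in enumerate(objects):
        # Neighborhood granule of x: all objects within distance delta of x.
        granule = {j for j, y in enumerate(objects) if mixed_distance(x, y, ranges) <= delta}
        if granule <= target:  # granule lies entirely inside the target set
            lower.add(i)
        if granule & target:   # granule overlaps the target set
            upper.add(i)
    return lower, upper


# Toy data: one numeric, one symbolic and one set-valued attribute.
objects = [
    (0.0, "a", frozenset({"x"})),
    (0.1, "a", frozenset({"x"})),
    (1.0, "b", frozenset({"y"})),
    (0.9, "b", None),  # missing set-valued entry
]
lower, upper = approximations({0, 1, 2}, objects, delta=0.2)
print(lower, upper)  # lower is {0, 1}; upper is {0, 1, 2, 3}
```

Objects 2 and 3 fall into the same granule at `delta=0.2`, so object 2 cannot be placed in the lower approximation of `{0, 1, 2}` with certainty. Re-running `approximations` with different values of `delta` gives coarser or finer granules, which is one simple way to picture granulation that changes over time.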