• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2011, Vol. 33 ›› Issue (5): 141-145.

• Articles

Application of a New Decision Tree Model in Employment Analysis

常志玲,王岚   

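  1. (School of Information Technology, Luoyang Normal University, Luoyang, Henan 471022)
  • Received: 2010-06-22 Revised: 2010-10-08 Publication date: 2011-05-25 Release date: 2011-05-25
  • About the authors: CHANG Zhiling (b. 1976), female, from Puyang, Henan; master's degree; lecturer; research interests: rough set theory and computational intelligence. WANG Lan (b. 1967), female, from Luoyang, Henan; master's degree; professor; research interest: data mining.
  • Funding:

    Teaching Reform Project of Luoyang Normal University (200826); Natural Science Research Program of the Education Department of Henan Province (2010A520030)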

Data Mining in Employment Based on a New Decision Tree

CHANG Zhiling,WANG Lan   

  1. (School of Information Technology,Luoyang Normal University,Luoyang 471022,China)
  • Received:2010-06-22 Revised:2010-10-08 Online:2011-05-25 Published:2011-05-25

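Abstract:

Decision trees are a commonly used classification method in data mining. To handle the inconsistent data caused by noise in the employment records of university students, this paper proposes a decision tree model based on the variable precision rough set and applies it to the analysis of student employment data. The method uses the variable precision rough set measure of classification quality as the information function for selecting condition attributes as tree nodes, and splits the data set top-down until a termination condition is met. It takes full account of the dependency and redundancy among attributes, and allows a certain degree of inconsistency among the classes of the instances assigned to the positive region while the tree is built. Experiments show that the algorithm handles inconsistent data sets effectively and classifies the employment data correctly and reasonably, yielding a number of valuable conclusions for decision analysis. The algorithm greatly improves the generalization ability of the decision rules and simplifies the tree structure.

Keywords: decision tree, variable precision rough set, student employment, decision rules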

Abstract:

The decision tree is a commonly used classification method in data mining. To handle the inconsistent data that noise introduces into the employment records of university graduates, this paper proposes a new heuristic function for building decision trees based on the variable precision rough set. The measure of quality of classification serves as the information function for selecting the condition attribute, and the selected attribute becomes the decision tree node that divides the data set. The dependency and redundancy between attributes are taken into account, and a certain degree of inconsistency is allowed among the examples in the positive regions. The method classifies the employment data correctly and yields some valuable results for analysis and decision making; it also simplifies the decision trees and improves the generalization ability of the decision rules.

Key words: decision tree; variable precision rough set; employment of university graduates; decision rules
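
The abstract describes the attribute selection only in words, so the following Python sketch is one possible reading of it and not the authors' implementation. It assumes that an equivalence class (the rows sharing one attribute value) belongs to the positive region when its majority class share is at least a precision threshold `beta`, and that the quality of classification is the fraction of rows falling in such classes. The function names, the threshold values, and the toy attributes `grade`, `major`, and `employed` are all hypothetical.

```python
from collections import Counter, defaultdict

def majority(labels):
    """Return the most common label and its share of the labels."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

def partition(rows, attr):
    """Group rows into equivalence classes by their value of attr."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[row[attr]].append(row)
    return blocks

def quality(rows, attr, target, beta):
    """Fraction of rows in blocks whose majority class share is >= beta."""
    covered = 0
    for block in partition(rows, attr).values():
        _, share = majority([r[target] for r in block])
        if share >= beta:
            covered += len(block)
    return covered / len(rows)

def build_tree(rows, attrs, target, beta):
    """Split top-down on the attribute with the highest quality."""
    label, share = majority([r[target] for r in rows])
    # Stop when the node is consistent enough or no attributes remain.
    if share >= beta or not attrs:
        return label
    best = max(attrs, key=lambda a: quality(rows, a, target, beta))
    rest = [a for a in attrs if a != best]
    return {best: {value: build_tree(block, rest, target, beta)
                   for value, block in partition(rows, best).items()}}

# Hypothetical toy data: one noisy record among the high-grade students.
ROWS = (
    [{"grade": "high", "major": "cs", "employed": "yes"}] * 4
    + [{"grade": "high", "major": "cs", "employed": "no"}]
    + [{"grade": "low", "major": "cs", "employed": "no"}] * 2
    + [{"grade": "low", "major": "math", "employed": "no"}] * 3
)

tolerant = build_tree(ROWS, ["grade", "major"], "employed", beta=0.75)
strict = build_tree(ROWS, ["grade", "major"], "employed", beta=1.0)
print(tolerant)  # {'grade': {'high': 'yes', 'low': 'no'}}
print(strict)
```

With `beta=0.75` the single noisy record is tolerated and the tree stops after one split; with `beta=1.0` the same data forces a further split on `major` under the `high` branch, which illustrates how allowing some inconsistency in the positive region can simplify the tree.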