• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2008, Vol. 30 ›› Issue (7): 73-76.

• Article

Implementation of the ID.3 Algorithm Based on Feature Histogram Equalization

唐玉鹏[1] 梁光明[1] 林嘉宇[1] 李文丰[1] 张军[2]   

  • Online:2008-07-01 Published:2010-05-22

Abstract:

The ID.3 algorithm is a classic decision tree algorithm, and an imbalanced sample distribution has a considerable influence on the tree's structure and its classification performance. Based on an analysis of the sample distribution in a feature database for cell recognition under the microscope, this paper applies histogram equalization to the sample features, transforming the feature distribution into an approximately uniform distribution on the interval [0,1]. Experiments show that, with feature histogram equalization, the error rate of ID.3 converges faster and the average depth of the resulting decision tree is reduced.

Key words: ID.3, quantization, feature histogram, equalization, average depth
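
The abstract does not spell out the equalization procedure, so the following is only a minimal sketch of one common way to do it: each feature value is mapped through the empirical cumulative distribution of the training samples, which spreads the values approximately uniformly over [0,1], and the result is then quantized into equal-width levels of the kind a decision tree can split on. The function names `fit_equalizer` and `quantize`, the exponential test data, and the choice of four levels are illustrative assumptions, not taken from the paper.

```python
import random
from bisect import bisect_right


def fit_equalizer(values):
    """Build a mapping from a feature value to its empirical CDF in [0, 1].

    Assumption: equalization is done per feature via the empirical CDF of
    the training samples; the paper's exact procedure may differ.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one training value")

    def equalize(x):
        # Fraction of training values that are <= x.
        return bisect_right(ordered, x) / n

    return equalize


def quantize(u, levels):
    """Map u in [0, 1] to an equal-width bin index in 0 .. levels - 1."""
    return min(int(u * levels), levels - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical skewed feature standing in for a real cell feature.
    feature = [rng.expovariate(1.0) for _ in range(10_000)]

    equalize = fit_equalizer(feature)
    counts = [0] * 4
    for x in feature:
        counts[quantize(equalize(x), 4)] += 1

    # After equalization the four bins hold nearly equal numbers of samples.
    print(counts)
```

Without the equalization step, equal-width bins on a skewed feature would place most samples in one or two bins, which is the kind of imbalance the abstract says distorts the tree.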