• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2011, Vol. 33 ›› Issue (9): 34-41.

• Paper •



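  1. (School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073)
  • Received: 2009-09-27 Revised: 2009-12-23 Published: 2011-09-25 Released: 2011-09-25
  • About the authors: LIU Wuying (born 1980), male, from Jiujiang, Jiangxi, PhD candidate, CCF member (E200011071M); research interests: text classification, information filtering, and machine learning. WANG Ting (born 1970), male, from Changsha, Hunan, PhD, professor, CCF member (E200007590S); research interests: natural language processing and computer software.
  • Funding: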


Ensemble Learning and Active Learning Based Personal Spam Email Filtering

LIU Wuying,WANG Ting   

  1. (School of Computer Science, National University of Defense Technology, Changsha 410073, China)
  • Received: 2009-09-27 Revised: 2009-12-23 Online: 2011-09-25 Published: 2011-09-25



Keywords: spam email filtering, personalization, ensemble learning, active learning, support vector machine


This paper proposes a personal spam email filtering method that learns a user's interests and updates them automatically according to the user's feedback. The method extracts linguistic and behavioral features to build several rule-based individual filters, and uses an SVM ensemble learning method to combine the filters' results. By applying active learning to select the most informative emails for the user to label, the method minimizes the number of labeled emails required and reaches steady-state performance more quickly. Experimental results show that the personal filtering method based on ensemble learning and active learning captures a user's personal preferences and achieves high performance in terms of accuracy, efficiency, and learning ability.
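
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the individual rules, the SVM solver, or the criterion for selecting emails, so everything concrete here is an assumption: the three toy filters, the email fields (`text`, `sender_known`, `num_links`), a linear SVM trained by subgradient descent on the hinge loss, and uncertainty sampling (querying the email closest to the decision boundary) as the active learning strategy. Labels are +1 for spam and -1 for legitimate mail.

```python
# Illustrative sketch only: rules, field names, and hyperparameters are assumptions.
SPAM_WORDS = {"free", "winner", "prize"}

def keyword_filter(email):
    """Linguistic rule: score by the number of suspicious words."""
    hits = sum(w in SPAM_WORDS for w in email["text"].lower().split())
    return 1.0 if hits >= 2 else (0.0 if hits == 1 else -1.0)

def sender_filter(email):
    """Behavioral rule: mail from an unknown sender looks more like spam."""
    return -1.0 if email["sender_known"] else 1.0

def link_filter(email):
    """Behavioral rule: many links look more like spam."""
    return 1.0 if email["num_links"] > 3 else -1.0

FILTERS = [keyword_filter, sender_filter, link_filter]

def features(email):
    """Stack the individual filter outputs into one vector for the combiner."""
    return [f(email) for f in FILTERS]

def decision(model, x):
    """Signed distance-like score; positive means spam."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    model = ([0.0] * len(X[0]), 0.0)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            w, b = model
            margin = yi * decision(model, xi)
            w = [wj - lr * lam * wj for wj in w]  # regularization step
            if margin < 1:  # hinge loss is active: move toward the example
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
            model = (w, b)
    return model

def most_uncertain(model, pool):
    """Uncertainty sampling: index of the email closest to the boundary."""
    return min(range(len(pool)),
               key=lambda i: abs(decision(model, features(pool[i]))))

def active_learning_round(labeled, pool, ask_user):
    """Train on labeled mail, ask the user about one email, and move it over."""
    X = [features(e) for e, _ in labeled]
    y = [label for _, label in labeled]
    model = train_linear_svm(X, y)
    email = pool.pop(most_uncertain(model, pool))
    labeled.append((email, ask_user(email)))
    return model, email
```

Calling `active_learning_round` repeatedly, with `ask_user` returning the user's +1 or -1 judgment, grows the labeled set one queried email at a time. Emails on which the filters disagree are queried first, which is one plausible reading of how the labeling effort is kept small.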

Key words: spam email filtering; personal; ensemble learning; active learning; SVM