• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2008, Vol. 30 ›› Issue (10): 8-10.

• Papers •

Research on a Two-Way Spam E-mail Classification Model Based on Rough Set Theory

云炜 段禅伦   

  • Online:2008-10-01 Published:2010-05-19

Abstract:

Electronic mail is one of the most important applications of the Internet, and e-mail classification has become a hot research topic. Based on rough set theory, this paper proposes a two-way e-mail classification model that uses 0-1 Bernoulli data. While preserving the current classification accuracy, the model reduces the term-frequency information required for e-mail classification, which improves classification efficiency and advances the application of rough set theory to e-mail classification.

Key words: 0-1 Bernoulli data, e-mail classification model, rough set
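
The abstract does not spell out the model itself, so the following is only an illustrative sketch of the general idea it mentions: reducing the term information needed for classification without losing accuracy. It assumes that "0-1 Bernoulli data" means one binary feature per term (1 if the term occurs in a message, 0 otherwise) and uses a simple greedy, consistency-preserving attribute reduction in the spirit of rough set theory. The function names, the term list, and the toy messages are hypothetical and are not taken from the paper.

```python
def is_consistent(rows, labels, attrs):
    """Return True if messages that agree on every attribute in attrs share a label."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        # setdefault stores the first label seen for this key and returns it.
        if seen.setdefault(key, label) != label:
            return False
    return True


def greedy_reduct(rows, labels):
    """Drop attributes one at a time while the decision table stays consistent.

    This yields one reduced attribute set, not necessarily the smallest one.
    """
    attrs = list(range(len(rows[0])))
    if not is_consistent(rows, labels, attrs):
        raise ValueError("decision table is inconsistent")
    for a in list(attrs):
        trial = [b for b in attrs if b != a]
        if is_consistent(rows, labels, trial):
            attrs = trial
    return attrs


# Hypothetical toy data: one 0-1 feature per term, one row per message.
terms = ["free", "meeting", "winner", "report"]
rows = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
labels = ["spam", "spam", "ham", "ham", "ham", "spam"]

kept = greedy_reduct(rows, labels)
print([terms[i] for i in kept])
```

On this toy table a single term is enough to separate the two classes, so the remaining term columns can be dropped without changing any classification of the listed messages. Real mail corpora are rarely this clean, and the paper's two-way model may differ substantially from this sketch.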