J4 ›› 2008, Vol. 30 ›› Issue (8): 61-64.
• Articles •
张红梅[1] 王利华[2]
Abstract:
This paper explores the design of text filters based on association rules in three respects: (1) to handle the characteristics of Chinese web language, the n-Gram method is introduced to extract text features; (2) the concept of an edge (boundary) sample is proposed; (3) a negative-selection algorithm is introduced into the design of the association-rule-based filter and applied to the filter's detector set so that the detectors become self-tolerant, which yields a high-precision text filter. Experiments show that a filter whose detector set has undergone self-tolerance filters with higher precision.
Key words: text filtering, negative-selection algorithm, n-Gram, association rule
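The abstract names two techniques, n-Gram feature extraction and negative selection over a detector set, without giving their details. The following is a minimal sketch of how the two could fit together, not the authors' implementation. It assumes that a detector is simply a set of character n-grams (standing in for the antecedent of an association rule) and that "self" means known legitimate text. The names `char_ngrams`, `matches`, `negative_selection`, and `is_filtered` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set


def char_ngrams(text: str, n: int = 2) -> Set[str]:
    """Return the set of overlapping character n-grams, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    return {"".join(chars[i:i + n]) for i in range(len(chars) - n + 1)}


def matches(detector: FrozenSet[str], features: Set[str]) -> bool:
    """Assumed matching rule: a detector fires when all its n-grams occur."""
    return detector <= features


def negative_selection(candidates: Iterable[FrozenSet[str]],
                       self_texts: Iterable[str],
                       n: int = 2) -> List[FrozenSet[str]]:
    """Keep only the candidate detectors that match no self (legitimate) text."""
    self_features = [char_ngrams(t, n) for t in self_texts]
    return [d for d in candidates
            if not any(matches(d, f) for f in self_features)]


def is_filtered(text: str, detectors: Iterable[FrozenSet[str]], n: int = 2) -> bool:
    """A text is filtered out when at least one tolerant detector fires."""
    features = char_ngrams(text, n)
    return any(matches(d, features) for d in detectors)


# Hypothetical candidate detectors and self samples, for illustration only.
candidates = [frozenset({"sp", "am"}), frozenset({"he"})]
tolerant = negative_selection(candidates, self_texts=["hello"])
```

In this sketch the detector `{"he"}` is discarded because it fires on the self sample `"hello"`, so only `{"sp", "am"}` survives. Character-level n-grams need no word segmentation, which is one reason they suit Chinese web text.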
张红梅[1] 王利华[2]. 使用否定选择算法改进文本过滤 [Improving text filtering with a negative-selection algorithm][J]. J4, 2008, 30(8): 61-64.
Link to this article: http://joces.nudt.edu.cn/CN/
http://joces.nudt.edu.cn/CN/Y2008/V30/I8/61