• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2007, Vol. 29 ›› Issue (1): 103-104.

• Articles •

Constructing Synonym and Association Word Sets Using Latent Semantic Analysis and Association Rule Mining

张文东 易轶虎   

  • Publication date: 2007-01-01 Release date: 2010-05-30

Abstract:

The large number of synonyms and associated words encountered in text mining prevents the text feature space from expressing text semantics accurately and makes computation high-dimensional and complex. This paper uses latent semantic analysis and association rule mining to construct sets of synonyms and associated words. These sets are used to reduce the synonyms and associated words in the text feature space, lower information redundancy, and improve mining efficiency. The corresponding algorithm is described, and the experimental results are satisfactory.

Key words: text mining; latent semantic analysis; association rule mining; algorithm
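
The abstract does not reproduce the paper's algorithm, so the following is only an illustrative sketch of the association rule mining half of the idea, not the authors' method. It treats each document as a set of words and keeps word pairs whose co-occurrence support and rule confidence exceed chosen thresholds. The function name `associated_word_pairs`, the toy corpus, and the threshold values are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def associated_word_pairs(docs, min_support=0.5, min_confidence=0.8):
    """Return {(a, b): (support, confidence)} for pairwise rules a -> b.

    Hypothetical helper: each document is a list of words, and only
    two-word itemsets are considered.
    """
    n = len(docs)
    doc_sets = [set(d) for d in docs]

    # Number of documents containing each word and each unordered word pair.
    word_count = Counter(w for d in doc_sets for w in d)
    pair_count = Counter(
        pair for d in doc_sets for pair in combinations(sorted(d), 2)
    )

    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Check the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / word_count[x]
            if confidence >= min_confidence:
                rules[(x, y)] = (support, confidence)
    return rules


# Toy corpus (assumed data, not from the paper).
docs = [
    ["computer", "software", "program"],
    ["computer", "software"],
    ["computer", "hardware"],
    ["software", "program"],
]
rules = associated_word_pairs(docs, min_support=0.5, min_confidence=0.8)
print(rules)
```

In this toy corpus only the rule `program -> software` passes both thresholds. A word pair linked by such a rule could be treated as a candidate for an association word set. The latent semantic analysis step named in the abstract would need a separate term-document matrix decomposition and is not shown here.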