• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2012, Vol. 34 ›› Issue (11): 175-179.

• Articles •

An Approach of Information Semantic Clustering Based on Semantic Similarity

XIONG Fang1, HUANG Hongbin2, HUANG Yucheng1, FENG Song1, HU Jianzhong1

  1. (1. Department of Network and Information, Xiangya Hospital, Central South University, Changsha 410008, Hunan, China; 2. Key Laboratory of Science and Technology of Information System Engineering, National University of Defense Technology, Changsha 410073, Hunan, China)
  • Received: 2012-05-07 Revised: 2012-09-21 Online: 2012-11-25 Published: 2012-11-25
  • Funding:

    Fundamental Research Funds for the Central Universities, Ministry of Education (2010QYYL005)

Abstract:

Clustering the entity classes of information units according to their semantic similarity across distributed information sources is an important step in semi-automatic ontology mapping and in constructing a global view of distributed, heterogeneous information resources for information sharing in a virtual organization. To meet the need for a unified global view of distributed information resources, this paper uses an ontology-based metadata model and semantic similarity to define the semantic clustering feature (SCF). Based on the SCF, it designs SCFBHCA, a hybrid hierarchical clustering algorithm built on a semantic feature tree. The algorithm is analyzed both theoretically and experimentally. Compared with HCA and HCP, SCFBHCA is incremental, scalable, and more efficient.

Key words: ontology; metadata model; semantic similarity; semantic clustering feature; clustering
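
The abstract does not give the definition of the SCF or the steps of SCFBHCA, so the following is only a rough illustration of the general idea of incremental, similarity-driven clustering with a per-cluster summary. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the `ClusterFeature` class, the `incremental_cluster` function, the use of Jaccard overlap of attribute terms as a stand-in for semantic similarity, the threshold of 0.5, and the sample entity classes.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ClusterFeature:
    """Hypothetical cluster summary: member names plus pooled term counts."""
    members: list = field(default_factory=list)
    term_counts: Counter = field(default_factory=Counter)

    def add(self, name, terms):
        # Absorb one entity class into the summary without storing pairwise data.
        self.members.append(name)
        self.term_counts.update(terms)

    def similarity(self, terms):
        # Stand-in measure (assumption): Jaccard overlap between the new
        # entity's terms and all terms seen so far in this cluster.
        cluster_terms = set(self.term_counts)
        union = cluster_terms | set(terms)
        if not union:
            return 0.0
        return len(cluster_terms & set(terms)) / len(union)


def incremental_cluster(entities, threshold=0.5):
    """Assign each (name, terms) pair to its most similar cluster, or open a new one."""
    clusters = []
    for name, terms in entities:
        best = max(clusters, key=lambda c: c.similarity(terms), default=None)
        if best is not None and best.similarity(terms) >= threshold:
            best.add(name, terms)
        else:
            fresh = ClusterFeature()
            fresh.add(name, terms)
            clusters.append(fresh)
    return clusters


# Made-up entity classes from two imagined information sources.
entities = [
    ("Patient", {"name", "age", "gender", "id"}),
    ("Inpatient", {"name", "age", "gender", "id", "ward"}),
    ("Drug", {"drug_name", "dose", "manufacturer"}),
    ("Medicine", {"drug_name", "dose", "manufacturer", "price"}),
]
clusters = incremental_cluster(entities)
grouped = [c.members for c in clusters]
print(grouped)
```

Because each cluster is represented only by its summary, a new entity class can be placed by comparing it against the summaries rather than against every earlier entity, which is the sense in which such a scheme is incremental. A tree of such summaries, as the abstract's "semantic feature tree" suggests, would further reduce the number of comparisons, but that structure is not sketched here.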