J4 ›› 2007, Vol. 29 ›› Issue (10): 63-64.
• Article •
麻会东 刘国华 李旭 梁鹏 刘春辉 张凌宇
Abstract:
Document copy detection plays an important role in intellectual property protection and information retrieval: it can help prevent plagiarism and improve the efficiency of Internet retrieval. Copy detection for English text is already relatively mature, whereas research on copy detection for Chinese text is still at an early stage. This paper proposes a keyword-based fingerprint extraction method, proposes the k-words method for decomposing sentences, and defines the concept of a digital fingerprint tree, which is used to store the fingerprints. Finally, the proposed method is validated by experiments.
Key words: fingerprint, plagiarism, text chunk, matching
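The abstract names three components (keyword-based fingerprints, k-words sentence decomposition, and a digital fingerprint tree) without giving their definitions. The sketch below is only one possible reading, not the authors' algorithm. It assumes that each sentence has already been reduced to a list of keywords, that a "k-words" chunk is a window of k consecutive keywords, that a fingerprint is a fixed-length decimal string derived from a SHA-256 hash, and that the tree is a digit trie keyed by that string. All function and class names are hypothetical.

```python
import hashlib
from typing import Dict, List, Set, Tuple


def k_word_chunks(keywords: List[str], k: int) -> List[Tuple[str, ...]]:
    """Assumed reading of 'k-words': sliding windows of k consecutive keywords."""
    if k <= 0:
        raise ValueError("k must be positive")
    if len(keywords) < k:
        # Short sentences yield a single chunk (or none if empty).
        return [tuple(keywords)] if keywords else []
    return [tuple(keywords[i:i + k]) for i in range(len(keywords) - k + 1)]


def fingerprint(chunk: Tuple[str, ...], digits: int = 8) -> str:
    """Map a chunk to a fixed-length decimal string (hypothetical scheme)."""
    digest = hashlib.sha256("\x1f".join(chunk).encode("utf-8")).hexdigest()
    return str(int(digest, 16) % 10 ** digits).zfill(digits)


class FingerprintTree:
    """Digit trie: each edge is one decimal digit of a fingerprint."""

    def __init__(self) -> None:
        self.root: Dict = {}

    def insert(self, fp: str, doc_id: str) -> None:
        node = self.root
        for digit in fp:
            node = node.setdefault(digit, {})
        # Leaf payload: the documents that contain this fingerprint.
        node.setdefault("docs", set()).add(doc_id)

    def lookup(self, fp: str) -> Set[str]:
        node = self.root
        for digit in fp:
            node = node.get(digit)
            if node is None:
                return set()
        return set(node.get("docs", set()))


def overlap_ratio(tree: FingerprintTree, sentences: List[List[str]], k: int) -> float:
    """Fraction of the suspect document's chunks whose fingerprint is in the tree."""
    chunks = [c for s in sentences for c in k_word_chunks(s, k)]
    if not chunks:
        return 0.0
    hits = sum(1 for c in chunks if tree.lookup(fingerprint(c)))
    return hits / len(chunks)


# Index a source document, then score a suspect document against it.
tree = FingerprintTree()
source = [["文档", "复制", "检测", "技术"], ["知识产权", "保护"]]
for sentence in source:
    for chunk in k_word_chunks(sentence, 2):
        tree.insert(fingerprint(chunk), "doc-A")

suspect = [["复制", "检测", "技术"], ["互联网", "检索"]]
score = overlap_ratio(tree, suspect, 2)
```

In this sketch a higher `score` means more of the suspect document's keyword windows also occur in the indexed source. Keyword extraction itself (Chinese word segmentation and keyword selection) is outside the sketch and would have to be supplied separately.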
麻会东, 刘国华, 李旭, 梁鹏, 刘春辉, 张凌宇. Research on Chinese document copy detection based on keyword extraction [J]. J4, 2007, 29(10): 63-64.
Link to this article: http://joces.nudt.edu.cn/CN/
http://joces.nudt.edu.cn/CN/Y2007/V29/I10/63