• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2010, Vol. 32 ›› Issue (6): 115-117.doi: 10.3969/j.issn.1007130X.2010.

• Papers •

English Product Named Entity Recognition Based on Conditional Random Fields

ZHANG Chaosheng (张朝胜)1, GUO Jianyi (郭剑毅)1,2, XIAN Yantuan (线岩团)1,2, YU Zhengtao (余正涛)1,2, LEI Chunya (雷春雅)1, WANG Haixiong (王海雄)1

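  1. (1. School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650051; 2. Institute of Intelligent Information Processing, Yunnan Provincial Computer Technology Application Key Laboratory, Kunming, Yunnan 650051)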
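  • Received: 2009-09-26 Revised: 2009-12-21 Published: 2010-06-01 Released: 2010-06-01
  • Corresponding author: ZHANG Chaosheng E-mail: wonsodier@yahoo.com.cn
  • About the authors: ZHANG Chaosheng (born 1983), male, from Puyang, Henan; master's student; research interests: natural language processing and information extraction. GUO Jianyi, professor; research interests: pattern recognition and information extraction.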
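  • Funding:

    National Natural Science Foundation of China (60863011); Key Project of the Natural Science Foundation of Yunnan Province (2008CC023); Yunnan Province Reserve Talent Program for Young and Middle-aged Academic and Technical Leaders (2007PY0111); Key Project of the Yunnan Provincial Department of Education Fund (07Z11139)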

Named Entity Recognition of the Products with English Based on Conditional Random Fields

ZHANG Chaosheng1,GUO Jianyi1,2,XIAN Yantuan1,2,YU Zhengtao1,2,LEI Chunya1,WANG Haixiong1   

  1. (1.School of Information Engineering and Automation,Kunming University of Science and Technology,Kunming 650051;
    2.Institute of Intelligent Information Processing,Yunnan Provincial Computer
    Technology Application Key Laboratory,Kunming 650051,China)
  • Received:2009-09-26 Revised:2009-12-21 Online:2010-06-01 Published:2010-06-01

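Abstract (translated from the Chinese):

English product named entity recognition has so far been little studied in China or abroad. Addressing the TREC 2009 English product named entity (EPNE) recognition task, this paper proposes, for the first time, an EPNE recognition method based on the conditional random field (CRF) model. Within the CRF, the method uses the word as the segmentation granularity, makes full use of context and of indicator information specific to English product names as classification features, and combines these with a manually built brand word list for modeling. Experiments show that the method achieves good results: precision for English product entity recognition reaches 93.6% and recall reaches 92.4%.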
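The feature design described in the abstract (word-level units, context features, product-name cue features, and a brand word list) can be illustrated with a minimal sketch. The abstract does not list the actual features, so everything below is an assumption made for illustration: the `BRAND_LEXICON` entries, the window size of 2, and the particular cue features (capitalization, digits, hyphens) are hypothetical and are not taken from the paper.

```python
# Hypothetical brand word list; the paper's manually built list is not reproduced here.
BRAND_LEXICON = {"canon", "nikon", "sony"}


def token_features(tokens, i, window=2):
    """Build a feature dict for the word at position i (word-level granularity)."""
    word = tokens[i]
    has_digit = any(ch.isdigit() for ch in word)
    has_alpha = any(ch.isalpha() for ch in word)
    feats = {
        "word.lower": word.lower(),
        # Assumed cue features: product names often mix capitals, digits and hyphens.
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        "has_digit": has_digit,
        "has_hyphen": "-" in word,
        "is_alnum_mix": has_digit and has_alpha,
        "in_brand_lexicon": word.lower() in BRAND_LEXICON,
    }
    # Context features: neighbouring words inside the window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        key = f"word[{offset:+d}]"
        if 0 <= j < len(tokens):
            feats[key + ".lower"] = tokens[j].lower()
            feats[key + ".in_brand_lexicon"] = tokens[j].lower() in BRAND_LEXICON
        else:
            # Mark positions that fall outside the sentence.
            feats[key + ".pad"] = True
    return feats


def sentence_features(tokens):
    """Feature dicts for every word in a tokenized sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    for feats in sentence_features("I bought a Canon EOS-5D camera".split()):
        print(feats)
```

Per-token feature dictionaries of this shape are a common input format for CRF toolkits; the choice of toolkit and the label scheme are left open here because the abstract does not specify them.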

Key words: English product; conditional random fields; feature selection; named entity recognition

Abstract:

Named entity recognition for English product names has received little research attention in China or abroad. Targeting the TREC 2009 task, this paper proposes a method for English product named entity recognition based on the conditional random field (CRF) model. Within the CRF, the method uses the word as the basic processing unit, takes context information and cue information specific to English product names as recognition features, and incorporates a manually constructed dictionary when building the model. Experiments show good results: precision for English product entity recognition reaches 93.6% and recall reaches 92.4%.
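For readers unfamiliar with how figures such as the reported precision and recall are usually obtained for entity recognition, the following minimal sketch computes both over entity spans. It rests on assumptions not stated in the abstract: a BIO tagging scheme with a single entity type, and exact-match scoring (a predicted entity counts as correct only if both boundaries agree with the gold annotation). The function names are hypothetical.

```python
def bio_spans(tags):
    """Return the set of (start, end) spans encoded by a BIO tag sequence."""
    spans, start = set(), None
    # A trailing "O" sentinel closes any span that runs to the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        if start is not None and tag != "I":
            spans.add((start, i))
            start = None
        if tag == "B":
            start = i
        elif tag == "I" and start is None:
            start = i  # tolerate an "I" that has no preceding "B"
    return spans


def precision_recall(gold_tags, pred_tags):
    """Exact-match, span-level precision and recall for one tag sequence."""
    gold, pred = bio_spans(gold_tags), bio_spans(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    gold = ["O", "B", "I", "O", "B"]
    pred = ["O", "B", "I", "O", "O"]
    print(precision_recall(gold, pred))  # (1.0, 0.5)
```

In the toy example the tagger finds one of the two gold entities and predicts nothing spurious, so precision is 1.0 and recall is 0.5.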

Key words: English product; conditional random fields (CRF); feature selection; named entity recognition

CLC number: