• Journal of the China Computer Federation (CCF)
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

计算机工程与科学 (Computer Engineering & Science)

• Artificial Intelligence and Data Mining

裁判文书类案推送中的案情相似度计算模型研究

王君泽1,2,马洪晶1,2,张毅1,2,杨兰蓉1,2   

  (1.华中科技大学公共管理学院,湖北 武汉 430074;2.华中科技大学非传统安全研究中心,湖北 武汉 430074)
  • 收稿日期:2019-03-08 修回日期:2019-09-02 出版日期:2019-12-25 发布日期:2019-12-25
  • Funding:

    National Natural Science Foundation of China (61602198)

A case similarity calculation model in case pushing of judicial documents

WANG Jun-ze(1,2), MA Hong-jing(1,2), ZHANG Yi(1,2), YANG Lan-rong(1,2)

  (1. College of Public Administration, Huazhong University of Science and Technology, Wuhan 430074;
   2. Non-traditional Security Center, Huazhong University of Science and Technology, Wuhan 430074, China)
  • Received: 2019-03-08 Revised: 2019-09-02 Online: 2019-12-25 Published: 2019-12-25

摘要:

裁判文书的类案推送策略有助于解决司法过程中的裁判尺度不统一、类案不同判、量刑不规范等问题。针对裁判文书类案推送任务,基于裁判文书在篇章结构和语言表述方面的特征,从裁判文书案情内容的抽取、案情内容中不同词性类别词项的权重分析、案情内容中未登录词的识别、案情内容中数量表述的相似度计算等角度展开研究,并设计相应的案情相似度计算模型。通过在真实裁判文书数据集合上的实验,表明了该模型的有效性。
 

关键词: 类案推送, 词性权重, 未登录词识别, 文本相似度

Abstract:

Pushing similar cases for judicial documents helps address problems in the judicial process such as inconsistent judgment standards, different judgments in similar cases, and irregular sentencing. For the task of similar-case pushing, and based on the discourse structure and linguistic expression of judicial documents, we study four aspects: extracting the case content from judicial documents, analyzing the weights of terms in different part-of-speech categories, recognizing unknown (out-of-vocabulary) words in the case content, and calculating the similarity of quantity expressions in the case content. We then design a corresponding case similarity calculation model. Experiments on a real judicial document dataset demonstrate the effectiveness of the model.
 

Key words: similar case pushing, part-of-speech (POS) weight, unknown Chinese word recognition, text similarity
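
The abstract names two ingredients, part-of-speech weighting of terms and similarity between quantity expressions, without giving formulas. The following is only a minimal sketch of how such ingredients could be combined, not the authors' model: the weight table `POS_WEIGHTS`, the min/max ratio used for quantities, the mixing parameter `alpha`, and all function names are illustrative assumptions. It also assumes the case content has already been extracted, segmented, and POS-tagged.

```python
import math
from collections import defaultdict

# Hypothetical part-of-speech weights (assumption, not taken from the paper).
POS_WEIGHTS = {"n": 1.0, "v": 0.8, "a": 0.5}
DEFAULT_WEIGHT = 0.2


def weighted_vector(tagged_tokens):
    """Build a term -> accumulated weight vector from (term, pos) pairs."""
    vec = defaultdict(float)
    for term, pos in tagged_tokens:
        vec[term] += POS_WEIGHTS.get(pos, DEFAULT_WEIGHT)
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def quantity_similarity(a, b):
    """Ratio of the smaller to the larger of two non-negative amounts."""
    if a == b:
        return 1.0
    return min(a, b) / max(a, b)


def case_similarity(tokens_a, tokens_b, qty_a, qty_b, alpha=0.8):
    """Mix text similarity and quantity similarity; alpha is an assumed weight."""
    text_sim = cosine(weighted_vector(tokens_a), weighted_vector(tokens_b))
    return alpha * text_sim + (1 - alpha) * quantity_similarity(qty_a, qty_b)
```

For example, `case_similarity([("盗窃", "v"), ("手机", "n")], [("盗窃", "v"), ("现金", "n")], 3000, 5000)` compares two theft descriptions that share a verb but differ in the stolen item and the amount involved.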