• Official journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2015, Vol. 37 ›› Issue (9): 1756-1760.

• Articles •

Abnormal data detection algorithm based on conditional random fields model

WANG Wenke1,WEN Yamei2,CAI Zhe2   

  1. College of Computer, National University of Defense Technology, Changsha 410073, China
  2. Information Center, Hunan Tobacco, Changsha 410004, China
  • Received: 2014-07-08  Revised: 2014-10-21  Online: 2015-09-25  Published: 2015-09-25

Abstract:

Data centers are an important decision-support tool for business leaders, and timely, accurate, and scientifically sound data are basic requirements and key principles. Finding abnormal records in huge volumes of data by human experience alone is difficult and inefficient. In this paper, we propose a machine-learning algorithm for detecting abnormal data. Because enterprise sales data consist of a series of relatively fixed data items, they can be treated as a structured data sequence. The conditional random fields (CRF) model is effective for predicting structured data sequences, so we use it as the detection model. The model is trained on a large volume of historical data to capture the intrinsic rules and relationships among data items, enabling computers to detect abnormal data automatically. Experimental results demonstrate the effectiveness of the proposed algorithm.

Key words: data center; machine learning; detection of abnormal data; conditional random fields model
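
The idea in the abstract of treating a sales record as a sequence of items and labelling each item with a CRF can be illustrated with a minimal linear-chain sketch. The abstract does not give the features, labels, or training procedure, so everything below is an assumption made for illustration: the two labels (`normal`, `abnormal`), the two observation symbols (`typical`, `extreme`, standing for an item's deviation from its historical range), and the hand-set weights in `EMIT` and `TRANS`, which a real system would learn from labelled historical data. The sketch only shows scoring, normalisation by the forward algorithm, and Viterbi decoding.

```python
import math

LABELS = ("normal", "abnormal")

# Hypothetical emission weights: how well a label fits an observed symbol.
EMIT = {
    ("normal", "typical"): 2.0,
    ("normal", "extreme"): -1.0,
    ("abnormal", "typical"): -1.0,
    ("abnormal", "extreme"): 1.5,
}

# Hypothetical transition weights between the labels of adjacent items.
TRANS = {
    ("normal", "normal"): 0.5,
    ("normal", "abnormal"): -0.5,
    ("abnormal", "normal"): -0.5,
    ("abnormal", "abnormal"): 0.2,
}


def score(labels, obs):
    """Unnormalised log-score of one label sequence for the observations."""
    total = sum(EMIT[(y, x)] for y, x in zip(labels, obs))
    total += sum(TRANS[(a, b)] for a, b in zip(labels, labels[1:]))
    return total


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(obs):
    """Forward algorithm: log of the summed exp(score) over all label sequences."""
    alpha = {y: EMIT[(y, obs[0])] for y in LABELS}
    for x in obs[1:]:
        alpha = {
            y: EMIT[(y, x)] + logsumexp([alpha[p] + TRANS[(p, y)] for p in LABELS])
            for y in LABELS
        }
    return logsumexp(list(alpha.values()))


def probability(labels, obs):
    """Conditional probability of a label sequence given the observations."""
    return math.exp(score(labels, obs) - log_partition(obs))


def viterbi(obs):
    """Highest-scoring label sequence, found by dynamic programming."""
    best = {y: (EMIT[(y, obs[0])], [y]) for y in LABELS}
    for x in obs[1:]:
        new_best = {}
        for y in LABELS:
            prev_score, prev_path = max(
                ((best[p][0] + TRANS[(p, y)], best[p][1]) for p in LABELS),
                key=lambda t: t[0],
            )
            new_best[y] = (prev_score + EMIT[(y, x)], prev_path + [y])
        best = new_best
    return max(best.values(), key=lambda t: t[0])[1]


# A record whose third item falls outside its usual range.
record = ["typical", "typical", "extreme", "typical"]
print(viterbi(record))
```

With these toy weights, decoding the example record labels only the third item as `abnormal`. The transition weights are what distinguish this from checking each item in isolation: an item's label depends on its neighbours as well as on its own value.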