• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2020, Vol. 42 ›› Issue (07): 1318-1324. doi: 10.3969/j.issn.1007-130X.2020.07.022

• Data Mining and Artificial Intelligence •

Optimization of the spatio-temporal anomalous regions detection by unbiased KL divergence algorithm

LIU Yun, WANG Zi-yu

  1. (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

  • Received: 2019-11-29 Revised: 2020-02-07 Accepted: 2020-07-25 Online: 2020-07-25 Published: 2020-07-27
  • Supported by:
    National Natural Science Foundation of China (61761025)

Abstract: Measuring anomalies in multivariate spatio-temporal time series makes it possible to detect the anomalous portions of large volumes of spatio-temporal event data. In contrast to techniques that detect isolated anomalous data points, this paper proposes an unbiased KL divergence algorithm (UKLD) that detects anomalous intervals. The algorithm first defines an anomalous interval in a spatio-temporal time series. After time-delay embedding, Gaussian distributions are used to estimate the distribution of the scanned interval and that of the remaining data, and cumulative sums are used to speed up the estimation of the Gaussian parameters. Finally, the unbiased KL divergence between the two distributions is computed and used as the anomaly score of the scanned interval, from which the spatio-temporal anomalous intervals are obtained. Simulation results show that, compared with the HOT SAX and RKDE algorithms, UKLD achieves higher accuracy and detects anomalous intervals in spatio-temporal data more effectively.


Key words: spatio-temporal data, anomalous interval detection, unbiased divergence, KL divergence
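
The pipeline summarized in the abstract (time-delay embedding, Gaussian estimates for the scanned interval and the remaining data, cumulative sums to speed up parameter estimation, and a KL-based interval score) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the parameters `k`, `min_len`, `max_len` and `ridge`, and the direction KL(interval || rest) are all assumptions made for illustration. The abstract does not specify the unbiased correction, so the sketch uses the standard closed-form KL divergence between two Gaussians and omits that correction.

```python
import numpy as np


def time_delay_embed(x, k):
    """Concatenate k consecutive samples per row: (n, d) -> (n - k + 1, k * d)."""
    n = x.shape[0]
    return np.hstack([x[i:n - k + 1 + i] for i in range(k)])


def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) )."""
    d = mu0.shape[0]
    diff = mu1 - mu0
    trace_term = np.trace(np.linalg.solve(cov1, cov0))
    maha_term = diff @ np.linalg.solve(cov1, diff)
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (trace_term + maha_term - d + logdet1 - logdet0)


def interval_scores(x, k=3, min_len=10, max_len=50, ridge=1e-6):
    """Score every interval [a, b) of the embedded series (hypothetical helper).

    Plain Gaussian KL is used as the score; the paper's unbiased
    correction is NOT implemented here.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    z = time_delay_embed(x, k)
    n, d = z.shape

    # Prefix sums with a leading zero entry, so that the sum over
    # [a, b) equals s[b] - s[a] and costs O(1) per interval.
    s1 = np.vstack([np.zeros((1, d)), np.cumsum(z, axis=0)])
    outer = z[:, :, None] * z[:, None, :]
    s2 = np.concatenate([np.zeros((1, d, d)), np.cumsum(outer, axis=0)])

    def moments(total, total_outer, m):
        mu = total / m
        # Small ridge term (an assumption) keeps the covariance invertible.
        cov = total_outer / m - np.outer(mu, mu) + ridge * np.eye(d)
        return mu, cov

    results = []
    for a in range(n):
        for b in range(a + min_len, min(a + max_len, n) + 1):
            m_in, m_out = b - a, n - (b - a)
            if m_out < min_len:
                continue
            in_sum, in_outer = s1[b] - s1[a], s2[b] - s2[a]
            mu_i, cov_i = moments(in_sum, in_outer, m_in)
            mu_r, cov_r = moments(s1[n] - in_sum, s2[n] - in_outer, m_out)
            results.append((gaussian_kl(mu_i, cov_i, mu_r, cov_r), a, b))
    return results


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = rng.normal(size=(200, 1))
    series[100:130] += 6.0  # injected anomalous interval (synthetic data)
    score, start, end = max(interval_scores(series, k=2, max_len=40))
    print(f"highest-scoring interval: [{start}, {end}) with score {score:.2f}")
```

The prefix sums mean that the mean and covariance of any interval, and of its complement, come from a constant number of array subtractions rather than a pass over the data. Plain KL divergence tends to favor short intervals, whose sample covariances are small, which is presumably the bias that the unbiased variant in the paper addresses.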