• Journal of the China Computer Federation
  • Core Journal of Chinese Science and Technology
  • Chinese Core Journal

Computer Engineering & Science

• Artificial Intelligence and Data Mining

Complexity and related properties of the SAX feature representation method, studied with permutation entropy

SONG Wei¹, SONG Yu¹, ZHANG Fan², FAN Ming¹, YE Yangdong¹

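  1. School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001;
  2. School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou, Henan 450046
  • Received: 2017-09-17 Revised: 2018-01-24 Online: 2018-07-25 Published: 2018-07-25
  • Supported by:

    National Natural Science Foundation of China (61170223, 61772475); Henan Province Science and Technology Research Project (172102410065, 172102210011); Key Scientific Research Project of Higher Education Institutions of Henan Province (17A520057)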

A study on complexity and relative intrinsic properties of
SAX representation method based on permutation entropy

SONG Wei¹, SONG Yu¹, ZHANG Fan², FAN Ming¹, YE Yangdong¹

  1. School of Information Engineering, Zhengzhou University, Zhengzhou 450001;
  2. School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450046, China
  • Received: 2017-09-17 Revised: 2018-01-24 Online: 2018-07-25 Published: 2018-07-25

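Abstract:

Symbolic aggregate approximation (SAX) is a typical and effective symbolic feature representation method. SAX is widely applied in practice, yet analyses of its intrinsic properties, such as complexity, information loss, correlation, and periodicity, remain relatively rare. This paper uses permutation entropy to measure the complexity of the SAX method and the statistical characteristics of its related properties. Experiments on experimental datasets and on real physiological data show that SAX markedly reduces the complexity of the feature representation and mitigates the redundancy effect; in addition, SAX largely preserves the intrinsic correlation measured by the autocorrelation function (ACF). This work can support the SAX method and its further application, and it provides analytical and statistical tools for the design and evaluation of new symbolic feature representation methods.

Key words: permutation entropy, symbolic aggregate approximation, intrinsic properties, complexity, correlation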

Abstract:

Symbolic aggregate approximation (SAX) is a typical and effective symbolic feature representation method. SAX is widely used in practice, but analyses of its intrinsic properties, such as complexity, information loss, correlation, and periodicity, remain relatively rare. In this paper, we apply permutation entropy to measure the complexity of the SAX method and the statistical characteristics of its related intrinsic properties. Experiments on benchmark datasets and real clinical data show that the SAX method significantly reduces the complexity of the feature representation and alleviates the redundancy effect, while preserving the inherent correlation as measured by the autocorrelation function (ACF). This work supports the SAX method and its further application, and provides analytical and statistical tools for the design and evaluation of new symbolic representation methods.
 
 

Key words: permutation entropy, symbolic aggregate approximation, intrinsic properties, complexity, correlation
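
For readers unfamiliar with the two techniques named in the abstract, the following is a minimal Python sketch of a textbook SAX transform (z-normalization, piecewise aggregate approximation, Gaussian breakpoints) and of permutation entropy over ordinal patterns. It is not the authors' implementation. The function names `sax` and `permutation_entropy`, the parameters `n_segments`, `alphabet_size`, `order`, and `delay`, and the tie-breaking rule for equal values are all assumptions made for illustration; the abstract does not state the settings used in the paper.

```python
import math
from bisect import bisect_right
from collections import Counter
from statistics import NormalDist, fmean, pstdev


def sax(series, n_segments, alphabet_size):
    """Return a SAX word as a list of integer symbols (0 = lowest band).

    Illustrative sketch: assumes len(series) >= n_segments.
    """
    mu, sigma = fmean(series), pstdev(series)
    # z-normalize; a constant series maps to all zeros
    z = [0.0 if sigma == 0 else (x - mu) / sigma for x in series]
    n = len(z)
    # Piecewise aggregate approximation: mean of each (nearly) equal frame
    paa = [
        fmean(z[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]
    # Breakpoints split the standard normal into equiprobable bands
    std_normal = NormalDist()
    breakpoints = [
        std_normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)
    ]
    return [bisect_right(breakpoints, v) for v in paa]


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy (bits) of the ordinal patterns of length `order`.

    Assumption: ties are broken by time index (stable sort), which matters
    for symbolic sequences where equal values are common.
    """
    if order < 2:
        raise ValueError("order must be at least 2")
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for this order and delay")
    counts = Counter()
    for i in range(n_windows):
        window = [series[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    h = -sum((c / n_windows) * math.log2(c / n_windows) for c in counts.values())
    # Normalizing by log2(order!) maps the entropy into [0, 1]
    return h / math.log2(math.factorial(order)) if normalize else h


if __name__ == "__main__":
    import random

    rng = random.Random(0)  # fixed seed so the demo is repeatable
    raw = [math.sin(2 * math.pi * t / 50) + 0.3 * rng.gauss(0, 1) for t in range(1000)]
    word = sax(raw, n_segments=200, alphabet_size=8)
    print("PE of raw series:", permutation_entropy(raw, order=3))
    print("PE of SAX word:  ", permutation_entropy(word, order=3))
```

The demo compares the permutation entropy of a synthetic noisy sine wave with that of its SAX word. The resulting values depend on the chosen segment count, alphabet size, and pattern order, so they should not be read as a reproduction of the paper's results.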