• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

J4 ›› 2016, Vol. 38 ›› Issue (02): 255-261.

• Papers

时空轨迹大数据分布式蜂群模式挖掘算法

于彦伟 (1,2), 齐建鹏 (1), 陆云辉 (1,2), 赵金东 (1), 张永刚 (2)

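  1. (1. School of Computer and Control Engineering, Yantai University, Yantai, Shandong 264005;
    2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012)
  • Received: 2015-09-03 Revised: 2015-11-12 Online: 2016-02-25 Published: 2016-02-25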
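  • Funding:

    National Natural Science Foundation of China (61403328, 61572419, 61403329); Open Fund of the Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172014K13); Shandong Province Key Research and Development Program (2015GSF115009); Natural Science Foundation of Shandong Province (ZR2013FM011); Science and Technology Program of Higher Education Institutions of Shandong Province (J14LN24, J14LN70)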

Distributed swarm pattern mining algorithm in big spatio-temporal trajectory data

YU Yanwei1,2,QI Jianpeng1,LU Yunhui1,2,ZHAO Jindong1,ZHANG Yonggang2   

  1. (1. School of Computer and Control Engineering, Yantai University, Yantai 264005;
    2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China)
  • Received: 2015-09-03 Revised: 2015-11-12 Online: 2016-02-25 Published: 2016-02-25

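Abstract:

To meet the demand for swarm pattern mining on big spatio-temporal trajectory data, an efficient MapReduce-based distributed swarm pattern mining algorithm is proposed. First, the concept of the object-closed swarm pattern, based on the maximum moving object set, is introduced, and the serial mining algorithm is optimized using the minimum time support set. Second, a parallel mining model for swarm patterns is proposed, which exploits the independence of swarm patterns across the time domain to parallelize clustering and swarm pattern mining on time subdomains. Third, a distributed parallel mining algorithm based on a chained MapReduce architecture is designed, which carries out parallel swarm pattern mining quickly in four stages. Finally, the effectiveness and efficiency of the distributed algorithm are verified and analyzed on the Hadoop platform using large real-world traffic trajectory datasets.

Keywords: spatio-temporal trajectory mining, big data, swarm pattern, distributed, MapReduce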

Abstract:

We propose an efficient MapReduce-based distributed algorithm for mining swarm patterns from big spatio-temporal trajectory data. We first define the object-closed swarm pattern based on the maximum moving object set, and optimize the serial mining algorithm with a minimum time support set strategy to reduce computation cost. We then propose a parallel swarm mining model that exploits the independence of swarm patterns across the time domain to parallelize both the clustering and the object-closed swarm mining on time subdomains. Finally, we propose a distributed mining algorithm based on a chained MapReduce architecture, which quickly discovers swarm patterns in big trajectory data through a four-stage framework. Experimental evaluations on the Hadoop platform, using massive-scale real-world traffic trajectory datasets, demonstrate the effectiveness and efficiency of the proposed distributed algorithm.

Key words: spatio-temporal trajectory mining; big data; swarm pattern; distributed; MapReduce
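
The abstract names the ideas but does not give the algorithm, so the following is only a minimal single-machine Python sketch under stated assumptions, not the authors' method. It assumes the usual definition of a swarm from the trajectory-mining literature: at least `min_o` objects that fall in the same cluster at no fewer than `min_t` timestamps, which need not be consecutive. The function names `cluster_snapshot` and `object_closed_swarms`, the parameters `eps`, `min_o`, and `min_t`, and the toy data are all hypothetical. A simple distance-threshold grouping stands in for whatever clustering the paper uses, the search is brute force, and neither the minimum time support set optimization nor Hadoop is involved. The only point illustrated is that each timestamp can be clustered independently (as a map task might do) before object sets are intersected across timestamps.

```python
import math
from collections import defaultdict
from itertools import combinations


def cluster_snapshot(points, eps):
    """Group the objects of ONE timestamp (assumed stand-in for the paper's clustering).

    points: dict object_id -> (x, y). Objects closer than eps are linked,
    and linked groups are merged with a small union-find.
    """
    ids = sorted(points)
    parent = {i: i for i in ids}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for u, v in combinations(ids, 2):
        (x1, y1), (x2, y2) = points[u], points[v]
        if math.hypot(x1 - x2, y1 - y2) <= eps:
            parent[find(u)] = find(v)

    groups = defaultdict(set)
    for i in ids:
        groups[find(i)].add(i)
    return [frozenset(g) for g in groups.values()]


def object_closed_swarms(clusters_by_time, min_o, min_t):
    """Brute-force search (hypothetical helper, requires min_t >= 1).

    Returns a set of (objects, timestamps) pairs where `objects` is the
    largest object set clustered together at every listed timestamp.
    """
    times = sorted(clusters_by_time)
    results = set()

    def extend(start, objs, chosen):
        if objs is not None and len(chosen) >= min_t:
            results.add((objs, tuple(chosen)))
        for j in range(start, len(times)):
            for cluster in clusters_by_time[times[j]]:
                nxt = cluster if objs is None else objs & cluster
                if len(nxt) >= min_o:  # prune object sets that are too small
                    extend(j + 1, nxt, chosen + [times[j]])

    extend(0, None, [])
    return results


# Toy data (made up): three objects observed at four timestamps.
snapshots = {
    0: {"a": (0, 0), "b": (1, 0), "c": (10, 10)},
    1: {"a": (5, 5), "b": (20, 20), "c": (5.5, 5)},
    2: {"a": (3, 3), "b": (3.5, 3), "c": (30, 30)},
    3: {"a": (0, 1), "b": (0, 1.5), "c": (0, 2)},
}

# Each timestamp is clustered on its own; no call depends on another,
# which is the kind of independence a map phase can exploit.
clusters = {t: cluster_snapshot(p, eps=1.5) for t, p in snapshots.items()}
swarms = object_closed_swarms(clusters, min_o=2, min_t=3)
print(swarms)
```

In this toy data, objects a and b are clustered together at timestamps 0, 2, and 3 but not at 1, so with `min_o=2` and `min_t=3` they form the only swarm reported. The gap at timestamp 1 is what the swarm definition tolerates.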