
Computer Engineering & Science ›› 2021, Vol. 43 ›› Issue (11): 2069-2070.

• Artificial Intelligence and Data Mining •

A multi-modal semantic trajectory prediction model based on self-attention mechanism

LIU Jie1,2, ZHANG Lei1,2, ZHU Shao-jie1,2, LIU Bai-long1,2, ZHANG Xue-fei3

  1. Engineering Research Center of Mine Digitalization, Ministry of Education, China University of Mining and Technology, Xuzhou 221116;
  2. School of Computer Science, China University of Mining and Technology, Xuzhou 221116;
  3. Inner Mongolia Guangna Information Technology Co., Ltd., Wuhai 016000, China
  • Received: 2020-07-05  Revised: 2020-10-13  Accepted: 2021-11-25  Online: 2021-11-25  Published: 2021-11-23
  • Supported by:
    The "Double First-Class" Construction Special Fund of China University of Mining and Technology (2018ZZCX14)

Abstract: With the rapid development of social media, multi-modal semantic trajectory prediction has become a new challenge. Dependencies between trajectory points play an important role in prediction, but they are hard to model: a trajectory carries several modalities (time, points of interest, and activity text), which give rise to intertwined temporal, spatial, and activity-intention dependencies that existing methods struggle to quantify. To address these problems, a Self-Attention mechanism based Multi-modal Semantic Trajectory Prediction model (SAMSTP) is proposed. SAMSTP first embeds the multi-modal features jointly, and then applies a self-attention mechanism combined with position encoding to compute the feature similarity between trajectory points, automatically learning and quantifying the weights of the complex dependencies while also capturing long-term dependencies. Finally, an LSTM network models the temporal order of the trajectory, and a mode normalization mechanism is designed to prevent dependency distortion and accelerate convergence. Experiments on real-world datasets show that SAMSTP is effective and outperforms state-of-the-art methods.


Key words: multi-modal, semantic trajectory, self-attention, mode normalization
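
Illustrative sketch: the abstract only names the building blocks of SAMSTP (joint multi-modal embedding, self-attention with position encoding, and an LSTM), so the PyTorch sketch below shows one way such a pipeline could be wired together. Every concrete choice here (the class name SAMSTPSketch, embedding sizes, the sinusoidal position encoding, a single attention and LSTM layer, the next-POI classification head, and the toy inputs) is an assumption for illustration, not the authors' implementation.

```python
# Minimal illustrative sketch, NOT the authors' implementation: layer sizes,
# the sinusoidal position encoding, and the next-POI prediction head are all
# assumptions; the paper's mode-normalization mechanism is omitted.
import math
import torch
import torch.nn as nn


class SAMSTPSketch(nn.Module):
    """Joint multi-modal embedding -> self-attention + position encoding -> LSTM -> next-POI logits."""

    def __init__(self, num_pois, num_time_slots, num_activities, d_model=128, n_heads=4):
        super().__init__()
        # Joint embedding of the three modalities named in the abstract:
        # time, point of interest (POI), and activity (reduced here to a category id).
        self.poi_emb = nn.Embedding(num_pois, d_model)
        self.time_emb = nn.Embedding(num_time_slots, d_model)
        self.act_emb = nn.Embedding(num_activities, d_model)
        self.fuse = nn.Linear(3 * d_model, d_model)
        # Self-attention over trajectory points to weight their pairwise dependencies.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # LSTM to model the sequential order of the trajectory.
        self.lstm = nn.LSTM(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, num_pois)  # predict the next POI

    @staticmethod
    def positional_encoding(seq_len, d_model, device):
        # Standard Transformer-style sinusoidal position encoding (an assumption).
        pos = torch.arange(seq_len, device=device, dtype=torch.float32).unsqueeze(1)
        div = torch.exp(torch.arange(0, d_model, 2, device=device, dtype=torch.float32)
                        * (-math.log(10000.0) / d_model))
        pe = torch.zeros(seq_len, d_model, device=device)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        return pe

    def forward(self, poi_ids, time_ids, act_ids):
        # Each input is an integer tensor of shape (batch, seq_len).
        x = torch.cat([self.poi_emb(poi_ids),
                       self.time_emb(time_ids),
                       self.act_emb(act_ids)], dim=-1)
        x = self.fuse(x)
        x = x + self.positional_encoding(x.size(1), x.size(2), x.device)
        x, _ = self.attn(x, x, x)   # feature similarity between trajectory points
        x, _ = self.lstm(x)         # temporal dependency along the trajectory
        return self.head(x[:, -1])  # logits over candidate next locations


# Toy usage: a batch of 2 trajectories with 10 points each, random ids.
model = SAMSTPSketch(num_pois=500, num_time_slots=48, num_activities=20)
logits = model(torch.randint(0, 500, (2, 10)),
               torch.randint(0, 48, (2, 10)),
               torch.randint(0, 20, (2, 10)))
print(logits.shape)  # torch.Size([2, 500])
```

The paper additionally applies a mode normalization mechanism to counter dependency distortion and speed up convergence; since the abstract does not specify where it sits in the pipeline or how it is formulated, it is left out of this sketch.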