• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (07): 1263-1273.

• Graphics and Images •

Skeleton behavior recognition based on attention-enhanced central difference adaptive graph convolution

BAI Shan, FENG Xiu-fang

  1. (School of Software, Taiyuan University of Technology, Jinzhong 030600, China)
  • Received: 2022-03-07 Revised: 2022-05-16 Accepted: 2023-07-25 Online: 2023-07-25 Published: 2023-07-11
  • Supported by:
    Key Research and Development Program of Shanxi Province (202102020101007)

Abstract: In recent years, graph convolutional networks have attracted considerable attention for their excellent performance in skeleton-based action recognition. However, most graph convolutions only aggregate node information and ignore the differences between the features of the central node and those of its adjacent nodes. This paper therefore proposes MRFAM-CDAGC, a central difference adaptive graph convolutional network based on a multiple-receptive-field attention mechanism. The model not only adaptively aggregates information from the nodes associated with the central node in the graph topology, but also merges the local motion information between adjacent nodes and aggregates the gradient features of the central node. The added multiple-receptive-field attention module lets the model focus on the more discriminative key joints and frames, thereby improving recognition accuracy. On the two benchmarks of the NTU-RGB-D dataset, the model reaches accuracies of 89.1% and 96.0%, respectively. It also generalizes to the large-scale Kinetics dataset, which verifies its strength in extracting spatiotemporal features and capturing global context information.

Key words: behavior recognition, central difference adaptive graph convolution, attention mechanism, skeleton recognition
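
The abstract contrasts plain neighbour aggregation with aggregation of the differences between a central node and its neighbours. The sketch below illustrates that idea for a single frame. It is not the authors' implementation: the mixing formula, the `theta` parameter, and the function name `central_difference_graph_conv` are assumptions borrowed from the general central difference convolution idea, and the adjacency matrix here is fixed rather than learned adaptively. The attention module is not modelled.

```python
import numpy as np


def central_difference_graph_conv(x, adj, weight, theta=0.5):
    """Hypothetical central difference graph convolution for one frame.

    x:      (V, C_in) features of V skeleton joints
    adj:    (V, V) matrix; row i holds the weights of joint i's neighbours
    weight: (C_in, C_out) feature projection
    theta:  assumed mixing factor in [0, 1]; 0 gives plain aggregation,
            1 gives pure difference (gradient-like) aggregation
    """
    aggregated = adj @ x                       # sum_j a_ij * x_j
    degree = adj.sum(axis=1, keepdims=True)    # sum_j a_ij
    difference = aggregated - degree * x       # sum_j a_ij * (x_j - x_i)
    mixed = (1.0 - theta) * aggregated + theta * difference
    return mixed @ weight


if __name__ == "__main__":
    # Toy 3-joint chain: joint 1 is connected to joints 0 and 2.
    adj = np.array([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0]])
    x = np.array([[0.0, 0.0],
                  [1.0, 0.5],
                  [2.0, 1.0]])
    weight = np.eye(2)
    print(central_difference_graph_conv(x, adj, weight, theta=0.5))
```

With `theta=0` the function reduces to an ordinary graph convolution, and with identical features on every joint the difference term vanishes, which is one way to see that this term responds only to local variation between a joint and its neighbours.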