• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2023, Vol. 45 ›› Issue (01): 95-103.

• Graphics and Images •

A tennis action recognition and evaluation method based on PoseC3D

ZHOU Sheng-ru, CHEN Zhi-gang, DENG Yi-qin

  1. (School of Computer Science and Engineering, Central South University, Changsha 410083, China)
  • Received: 2022-09-19 Revised: 2022-10-25 Accepted: 2023-01-25 Online: 2023-01-25 Published: 2023-01-25
  • Supported by:
    Major Special Project Fund of the Changsha Science and Technology Plan (kh2103016); Science and Technology Plan 2030 (2020AAA0109605)

Abstract: To recognize and evaluate tennis actions accurately, this paper combines computer vision with domain knowledge of tennis and proposes a tennis action recognition and evaluation method based on PoseC3D. First, a pose estimation model based on ResNet-50 detects the human targets in tennis videos and extracts skeleton keypoints. Second, the PoseC3D model is trained on a video dataset collected on professional tennis courts so that it can classify tennis sub-actions. Third, the dynamic time warping algorithm is used to evaluate the classified actions. Finally, extensive experiments are carried out on the collected video dataset. The results show that the proposed PoseC3D-based recognition method reaches a Top-1 accuracy of 90.8% on six classes of tennis sub-actions and generalizes better than methods based on graph convolutional networks, such as AGCN and ST-GCN. Moreover, the proposed scoring algorithm based on dynamic time warping gives accurate evaluation scores in real time once an action has been classified, which reduces the workload of tennis teachers and effectively improves the quality of tennis teaching.

Key words: pattern recognition, pose estimation, action recognition, convolutional neural network, dynamic time warping
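
The abstract does not give the scoring formula, so the following is only a minimal sketch of how a dynamic time warping score over skeleton keypoint sequences might look. It assumes each frame is a list of (x, y) keypoints and that a reference performance of the classified sub-action is available. The function names, the path-length normalisation, and the exponential mapping to a 0-100 score are illustrative assumptions, not the authors' method.

```python
import math


def frame_distance(frame_a, frame_b):
    """Mean Euclidean distance between corresponding keypoints of two frames."""
    return sum(math.dist(p, q) for p, q in zip(frame_a, frame_b)) / len(frame_a)


def dtw_distance(seq_a, seq_b):
    """Classic dynamic time warping cost between two keypoint sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat a frame of seq_b
                                 cost[i][j - 1],      # repeat a frame of seq_a
                                 cost[i - 1][j - 1])  # advance both sequences
    return cost[n][m]


def dtw_score(student, reference, scale=1.0):
    """Hypothetical mapping from DTW cost to a 0-100 score (an assumption).

    The cost is divided by the combined sequence length so that longer clips
    are not penalised, then squashed with exp() so zero cost gives 100.
    """
    normalised = dtw_distance(student, reference) / (len(student) + len(reference))
    return 100.0 * math.exp(-normalised / scale)


# Toy data: three frames with two keypoints each (hypothetical values).
reference = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 1.0), (1.0, 1.0)],
    [(0.0, 2.0), (1.0, 2.0)],
]
# Same motion performed more slowly: DTW absorbs the repeated frames.
slower = [reference[0], reference[0], reference[1], reference[2], reference[2]]
# Same timing but every keypoint displaced, so the poses do not match.
displaced = [[(x + 3.0, y + 4.0) for (x, y) in frame] for frame in reference]

slower_score = dtw_score(slower, reference)
displaced_score = dtw_score(displaced, reference)
```

In this sketch a slower but otherwise identical motion keeps the full score, because warping aligns the repeated frames at zero cost, while displaced poses lower the score. The `scale` parameter is a hypothetical knob controlling how quickly the score decays with distance.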