Computer Engineering & Science ›› 2025, Vol. 47 ›› Issue (4): 655-666.
• Computer Network and Information Security •
GONG Haocheng,ZHU Hai,HUANG Zifei,YANG Mingze,ZHANG Kaiyu,WU Fei
Abstract: With the rapid development of artificial intelligence and wireless sensing technologies, WiFi gesture recognition has emerged as a research area attracting significant attention. Current research efforts aim to enhance the robustness of models across different data domains and reduce the reliance on retraining by extracting domain-independent features from channel state information (CSI), such as the body-coordinate velocity profile (BVP). This enables high accuracy in both intra-domain and cross-domain recognition. However, in practical scenarios, converting collected CSI signals into BVP requires substantial computational resources, falling short of the real-time and scalability requirements of production environments. Additionally, traditional models lack the capability to capture global features and long-term dependencies when dealing with large and complex datasets. To address these issues, a representation knowledge distillation-based WiFi gesture recognition (RKD-WGR) framework is proposed. RKD-WGR uses BVP data as input to a teacher model that guides a student model taking CSI data as input. This transfers the BVP inference capability into the student model while allowing the student to learn from the CSI itself, complementing information missing from BVP. Meanwhile, to improve recognition performance and strengthen knowledge transfer from the teacher to the student, a 3D WiFi Transformer (3DWiT) is introduced as the teacher model. It leverages the spatio-temporal information of BVP to help the teacher acquire richer features and enhance its knowledge transfer capability. Experimental results on the Widar 3.0 dataset demonstrate that, without using BVP and relying solely on CSI, the accuracy reaches 97.1% for six gesture classes, 96.5% for ten gesture classes, and 89.5% for 22 gesture classes. These results validate the effectiveness of the proposed framework and model.
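The teacher-student training described in the abstract follows the general knowledge distillation recipe: the student is trained on a weighted sum of a hard-label loss on its own predictions and a soft-label loss that matches the teacher's temperature-softened outputs. The sketch below illustrates this generic distillation loss with NumPy; the temperature `T`, weight `alpha`, and the specific representation-level terms used by RKD-WGR are assumptions, not details taken from the paper.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T yields softer distributions."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Hinton-style distillation: alpha * hard CE + (1-alpha) * soft KL."""
    # Hard-label cross-entropy on the student's own (T=1) predictions.
    p_student = softmax(student_logits)
    ce = -np.log(p_student[label] + 1e-12)
    # Soft-label KL divergence against the teacher's tempered outputs.
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)))
    # The usual T^2 factor keeps soft-target gradients comparable across T.
    return alpha * ce + (1 - alpha) * (T ** 2) * kl
```

In the RKD-WGR setting, the teacher logits would come from the 3DWiT model fed with BVP, and the student logits from the CSI-based model, so that at inference time only the CSI branch is needed.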
Key words: WiFi, channel state information (CSI), gesture recognition, knowledge distillation, Vision Transformer
GONG Haocheng, ZHU Hai, HUANG Zifei, YANG Mingze, ZHANG Kaiyu, WU Fei. A representation knowledge distillation-based WiFi gesture recognition method[J]. Computer Engineering & Science, 2025, 47(4): 655-666.
URL: http://joces.nudt.edu.cn/EN/Y2025/V47/I4/655