• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science ›› 2024, Vol. 46 ›› Issue (06): 1072-1080.

• Graphics and Images •

Facial expression recognition based on network fusion to improve MobileViT

DENG Xiang-yu, PEI Hao-yuan, SHENG Ying

  1. (College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou 730070, China)
  • Received: 2023-04-26 Revised: 2023-10-13 Accepted: 2024-06-25 Online: 2024-06-25 Published: 2024-06-18

Abstract: This paper proposes a lightweight facial expression recognition network that improves MobileViT through network fusion. The network combines the multi-scale convolution PSConv with attention mechanisms through a residual structure to form the RAPsconv feature reconstruction module. This module extracts multi-scale features more efficiently at a fine-grained level and strengthens the representation of key features, which improves the network's expressive power and yields an end-to-end facial expression recognition network. In addition, to further reduce confusion between similar expressions, a loss function combining Softmax Loss and Center Loss is proposed, which effectively reduces the misclassification rate. Experimental results show that, compared with the base network MobileViT, the improved network achieves higher accuracy on three natural-scene expression datasets, FER2013, FER+, and RAF-DB, with improvements of 1.73%, 2.18%, and 1.64%, respectively. The improved network has fewer parameters and stronger robustness, and it lends itself to lightweight deployment and integration, making it suitable for real-world facial expression recognition applications.
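
The combined loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common formulation in which the total loss is the softmax cross-entropy plus a weighted center loss (half the squared distance between a sample's feature vector and the center of its class). The weight `lam`, the function names, and the use of fixed class centers are assumptions made for illustration; the abstract does not specify how the two terms are balanced or how the centers are updated.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def combined_loss(logits_batch, features_batch, labels, centers, lam=0.01):
    """Batch mean of softmax loss + lam * center loss.

    lam is a hypothetical weighting hyperparameter; the abstract gives no value.
    centers maps each class label to its (here fixed) center vector.
    """
    total = 0.0
    for logits, feature, label in zip(logits_batch, features_batch, labels):
        total += softmax_cross_entropy(logits, label)
        total += lam * center_loss(feature, centers[label])
    return total / len(labels)
```

In this formulation the softmax term separates classes, while the center term pulls features of the same class together, which is the usual motivation for pairing the two when classes are easily confused.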


Key words: facial expression recognition, MobileViT, multi-scale convolutional PSConv, attention mechanism, network fusion, lightweight network