• Journal of the China Computer Federation
  • Chinese Science and Technology Core Journal
  • Chinese Core Journal

Computer Engineering & Science


An audio recognition method based on
residual network and random forest

ZHANG Xiaolong1,2,3,PENG Yi1,2,3   

  (1. Hubei Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan 430065;
   2. Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan 430065;
   3. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, China)
     
  • Received: 2018-09-15  Revised: 2018-11-22  Online: 2019-04-25  Published: 2019-04-25

Abstract:

Environmental sound classification (ESC) is an important branch of audio processing and is expected to play a significant role in future multimedia applications. Audio recognition perceives and interprets the surrounding environment by extracting characteristic acoustic features from an audio sample and assigning the sample to its corresponding scene. At present, audio recognition relies mainly on signal processing techniques and machine learning methods. With the rapid development of artificial intelligence, these traditional techniques face severe challenges, and recognition accuracy on ESC tasks still needs to be improved. We propose an audio recognition method that combines a residual network with a random forest. The one-dimensional time-domain audio signals are first converted into two-dimensional Mel spectrograms. The residual network is then pretrained to obtain a high-precision network model, which serves as a feature extractor for deep audio features, and the random forest classifies these deep features. The method improves the ESC recognition rate by nearly 10%.

Key words: residual network, random forest, audio recognition, Mel spectrogram
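
The first step named in the abstract, converting a one-dimensional time-domain signal into a two-dimensional Mel spectrogram, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no parameters, so the frame length, hop size, number of Mel bands, the Hann window, the HTK-style Mel formula, and all function names below are assumptions chosen only for illustration.

```python
import numpy as np


def hz_to_mel(f):
    # HTK-style Mel scale (an assumption; the abstract does not name a variant).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 band edges, evenly spaced on the Mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb


def log_mel_spectrogram(signal, sr, n_fft=1024, hop=512, n_mels=64):
    """Map a 1-D signal to a 2-D log-Mel spectrogram of shape (n_mels, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_fft:
        # Zero-pad very short clips so that at least one frame exists.
        signal = np.pad(signal, (0, n_fft - signal.size))
    n_frames = 1 + (signal.size - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)                # decibel scale


if __name__ == "__main__":
    sr = 22050
    t = np.arange(sr) / sr
    tone = np.sin(2.0 * np.pi * 440.0 * t)  # one second of a 440 Hz test tone
    spec = log_mel_spectrogram(tone, sr)
    print(spec.shape)  # (n_mels, n_frames)
```

The resulting two-dimensional array can be treated like a single-channel image, which is what makes it a suitable input for an image-style network such as a residual network. The pretraining, deep feature extraction, and random forest stages described in the abstract are not covered by this sketch.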