Computer Engineering & Science, 2023, Vol. 45, Issue 12: 2226-2236.
• Artificial Intelligence and Data Mining •
JIAO Jia-hui1,2, MA Si-yuan1,2, SONG Yu2, SONG Wei1
Received: 2022-08-12
Revised: 2022-11-14
Accepted: 2023-12-25
Online: 2023-12-25
Published: 2023-12-14
JIAO Jia-hui, MA Si-yuan, SONG Yu, SONG Wei. Bi-modal music genre classification model MGTN based on convolutional attention mechanism[J]. Computer Engineering & Science, 2023, 45(12): 2226-2236.