Computer Engineering & Science, 2023, Vol. 45, Issue (02): 191-203.
DONG Pei-jie, NIU Xin, WEI Zi-mian, CHEN Xue-hui
Received: 2021-11-29
Revised: 2022-05-26
Accepted: 2023-02-25
Online: 2023-02-25
Published: 2023-02-15
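Abstract: The rapid development of deep learning is closely tied to innovation in neural network architectures. To make architecture design more efficient, neural architecture search (NAS), a family of algorithms that automate the design of network architectures, has become a research hotspot in recent years. Early NAS algorithms usually had to train and evaluate a large number of candidate networks, which incurred an enormous computational cost. Transfer learning can accelerate the convergence of candidate networks and thereby improve search efficiency. One-shot NAS algorithms, which rely on weight transfer, are built on a supergraph whose subgraphs share weights. This improves search efficiency, but it also raises challenging problems such as co-adaptation and poor ranking correlation. This paper first introduces the research on weight-sharing One-shot NAS algorithms, then analyzes the key techniques from three perspectives (sampling strategy, process decoupling, and stage-wise design), compares the search results of typical algorithms, and discusses future research directions.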
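The weight-sharing idea summarized in the abstract can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the method of any surveyed algorithm: the `Supernet` class, the op names, the scalar "weights", and the `train_step` update rule are all hypothetical stand-ins, and Kendall's tau is used here only as one common way to quantify ranking correlation.

```python
import itertools
import random

# Hypothetical toy search space: 3 layers, each choosing one of 2 candidate ops.
NUM_LAYERS = 3
CANDIDATE_OPS = ("conv3x3", "conv5x5")


class Supernet:
    """Holds one shared weight per (layer, op) pair; subnets index into it."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.weights = {
            (layer, op): rng.uniform(-1.0, 1.0)
            for layer in range(NUM_LAYERS)
            for op in CANDIDATE_OPS
        }

    def sample_subnet(self, rng):
        # Uniform single-path sampling: pick one op per layer at random.
        return tuple(rng.choice(CANDIDATE_OPS) for _ in range(NUM_LAYERS))

    def subnet_weights(self, subnet):
        # A subnet owns no parameters; it only selects shared ones.
        return [self.weights[(layer, op)] for layer, op in enumerate(subnet)]

    def train_step(self, subnet, lr=0.1):
        # Stand-in for a gradient step: nudge only the selected weights toward 1.0.
        for layer, op in enumerate(subnet):
            key = (layer, op)
            self.weights[key] += lr * (1.0 - self.weights[key])


def kendall_tau(xs, ys):
    """Kendall rank correlation (tau-a) between two equally long score lists."""
    concordant = discordant = 0
    for i, j in itertools.combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / pairs
```

In this sketch, training one sampled subnet also changes the weights seen by every other subnet that picks the same op in the same layer. That coupling is what makes the search cheap, and it is also one plausible source of the co-adaptation and ranking problems the abstract mentions. `kendall_tau` could then be used to compare scores estimated with shared weights against scores from stand-alone training of the same architectures.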
DONG Pei-jie, NIU Xin, WEI Zi-mian, CHEN Xue-hui. Review of one-shot neural architecture search[J]. Computer Engineering & Science, 2023, 45(02): 191-203.