AMC: AutoML for Model Compression and Acceleration on Mobile Devices

SIGAI Learning and Practice Platform

Published 2019-05-07 15:09:16

This article is included in the column: SIGAI Learning and Practice Platform

Recommended by the editor:

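1. Principles of network pruning

2. Walkthrough of network pruning papers

3. Basic principles of low-rank approximation

4. Network compression and quantization: low-rank approximation in practice

5. Network compression and quantization: principles of parameter quantization, cluster-based encoding, and fixed-point parameters

6. Network compression and quantization: parameter quantization in practice

7. Network compression and quantization: principles of model distillation in detail

8. Network compression and quantization: model distillation in practice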

Figure 1: Overview of the AMC pipeline

Figure 2: Constraints on the action
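
The kind of constraint named in the Figure 2 caption can be illustrated with a small sketch. It is not code from this article. It assumes that the action is a per-layer sparsity ratio (the fraction of a layer to remove), that each layer has a known cost (parameters or FLOPs), and that the overall target ratio does not exceed the per-layer cap. The function name `constrain_actions` and all parameter names are hypothetical. The idea is to raise an action whenever it is too small for the overall target to remain reachable, even if every later layer were pruned at the cap.

```python
from typing import List


def constrain_actions(
    raw_actions: List[float],
    layer_sizes: List[float],
    target_ratio: float,
    max_ratio: float,
) -> List[float]:
    """Bound per-layer sparsity ratios so the overall target stays reachable.

    Hypothetical names throughout. raw_actions[t] is the ratio proposed for
    layer t, layer_sizes[t] is that layer's cost, target_ratio is the fraction
    of the whole model to remove, and max_ratio caps any single layer.
    Assumes 0 <= raw_actions[t] <= 1 and target_ratio <= max_ratio.
    """
    total = sum(layer_sizes)
    reduced = 0.0  # cost already removed by the layers processed so far
    bounded = []
    for t, (action, size) in enumerate(zip(raw_actions, layer_sizes)):
        # Never prune a layer beyond the per-layer cap.
        action = min(action, max_ratio)
        # Cost that all later layers hold (prunable at most at max_ratio).
        rest = sum(layer_sizes[t + 1:])
        # Cost this layer must remove so the target remains reachable.
        duty = target_ratio * total - max_ratio * rest - reduced
        # Raise the action if it is too small to cover that duty.
        action = max(action, duty / size)
        reduced += action * size
        bounded.append(action)
    return bounded


if __name__ == "__main__":
    # An agent that proposes no pruning at all is pushed to prune late layers.
    print(constrain_actions([0.0, 0.0, 0.0, 0.0], [10, 20, 30, 40], 0.5, 0.8))
```

Under these assumptions the early layers keep whatever ratio was proposed, and the correction is applied only when the remaining layers can no longer make up the shortfall on their own.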

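Figure 3: Per-layer compression ratios under different pruning strategies

Figure 4: Accuracy results of different pruning strategies

Figure 5: Compressing ResNet50, AMC compared with hand-set policies

Figure 6: AMC's per-layer compression ratios for ResNet50

Figure 7: Comparison of AMC with heuristic pruning and hand-crafted pruning methods

Figure 8: Pareto curves of AMC and other compression methods

Figure 9: AMC compression results on MobileNet-V1 under different policies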


This article is an original work by SIGAI.

Originally published on 2019-04-22 on the SIGAI WeChat official account.
