Full List of Deep Learning / Machine Learning Papers at Top Computer Architecture Conferences in 2017

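In 2017, artificial intelligence developed rapidly in China: the government issued a series of policies supporting AI development, major technology companies raced to announce their AI strategies, and investors showed strong enthusiasm for this emerging field. As a core driver of the new round of industrial transformation, AI development in China is entering a new stage, and China is expected to become an important engine leading global AI development. It is fair to say that AI's progress in 2017, whether in academia, in industry, or even in the capital markets, far exceeded expectations. Today this editor summarizes the deep learning / machine learning papers published at top computer architecture conferences during 2017, for research reference.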

ISCA2017, 6 papers, keyword: TPU

[HW][ISCA2017]002-In-Datacenter Performance Analysis of a Tensor Processing Unit, Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, et al., https://arxiv.org/pdf/1704.04760

[HW][ISCA2017]003-SCALEDEEP: A Scalable Compute Architecture for Learning and Evaluating Deep Networks, Swagath Venkataramani§, Ashish Ranjan§, Sasikanth Avancha‡, Ashok Jagannathan‡, Anand Raghunathan§, Subarno Banerjee‡, Dipankar Das‡, Ajaya Durg‡, Dheemanth Nagaraj‡, Bharat Kaul‡, and Pradeep Dubey‡ (§ School of ECE, Purdue University, ‡ Parallel Computing Lab, Intel Corporation)

[HW][ISCA2017]004-SCNN: An Accelerator for Compressed-sparse Convolutional Neural Networks, Angshuman Parashar†, Minsoo Rhu†, Anurag Mukkara‡, Antonio Puglielli∗, Rangharajan Venkatesan†, Brucek Khailany†, Joel Emer†‡, Stephen W. Keckler†, and William J. Dally†⋄ (NVIDIA† Massachusetts Institute of Technology‡ UC-Berkeley∗ Stanford University⋄)

[HW][ISCA2017]005-Maximizing CNN Accelerator Efficiency Through Resource Partitioning, Yongming Shen, Michael Ferdman, Peter Milder (Stony Brook University)

[HW][ISCA2017]006-Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism, Jiecao Yu1, Andrew Lukefahr1, David Palframan2, Ganesh Dasika2, Reetuparna Das1, Scott Mahlke1 (1 University of Michigan, 2 ARM)

[HW][ISCA2017]007-Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent, Christopher De Sa, Matthew Feldman, Christopher Ré, Kunle Olukotun (Departments of Electrical Engineering and Computer Science Stanford University)

HPCA2017, 3 papers

[HW][HPCA2017]008-PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning, Linghao Song, University of Pittsburgh; Xuehai Qian, University of Southern California; Hai Li, University of Pittsburgh; Yiran Chen, University of Pittsburgh

[HW][HPCA2017]008-FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks, Wenyan Lu, ICT, CAS; Guihai Yan, ICT, CAS; Jiajun Li, ICT, CAS; Shijun Gong, ICT, CAS; Yinhe Han, ICT, CAS; Xiaowei Li, ICT, CAS

[HW][HPCA2017]009-Towards Pervasive and User Satisfactory CNN across GPU Microarchitectures, Mingcong Song, University of Florida; Yang Hu, University of Florida; Huixiang Chen, University of Florida; Tao Li, NSF/University of Florida

FPGA2017, 14 papers, keyword: ESE

[HW][FPGA2017]029-Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs,Ritchie Zhao1, Weinan Song1, Wentao Zhang1, Tianwei Xing2, Jeng-Hau Lin3, Mani Srivastava2, Rajesh Gupta3, Zhiru Zhang1;1Cornell University, 2UCLA, 3UCSD

[HW][FPGA2017]030-Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network,Jialiang Zhang and Jing Li,UW-Madison

[HW][FPGA2017]031-Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System,Chi Zhang and Viktor Prasanna,USC

[HW][FPGA2017]032-Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks, Yufei Ma, Yu Cao, Sarma Vrudhula, Jae-sun Seo,Arizona State University

[HW][FPGA2017]033-An OpenCL Deep Learning Accelerator on Arria 10 (Best Paper Candidate),Utku Aydonat, Shane O'Connell, Davor Capalija, Andrew Ling, Gordon Chiu,Intel

[HW][FPGA2017]034-FINN: A Framework for Fast, Scalable Binarized Neural Network Inference,Yaman Umuroglu1,2, Nicholas J. Fraser1,3, Giulio Gambardella1, Michaela Blott1, Philip Leong3, Magnus Jahre2, Kees Vissers1;1Xilinx Research Labs, 2Norwegian University of Science and Technology, 3University of Sydney

[HW][FPGA2017]035-ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA (Best Paper Award);Song Han1, Junlong Kang2, Huizi Mao1, Yiming Hu3, Xin Li2, Yubin Li2, Dongliang Xie2, Hong Luo2, Song Yao2, Yu Wang3, Huazhong Yang3, Bill Dally1;1Stanford University, 2DeePhi, 3Tsinghua University

[HW][FPGA2017]036-A Machine Learning Framework for FPGA Placement,Gary Grewal, Shawki Areibi, Matthew Westrik, Ziad Abuowaimer, Betty Zhao;University of Guelph

[HW][FPGA2017]037-Storage-Efficient Batching for Minimizing Bandwidth of Fully-Connected Neural Network Layers; Yongming Shen, Michael Ferdman, Peter Milder;Stony Brook University

[HW][FPGA2017]038-A Batch Normalization Free Binarized Convolutional Deep Neural Network on an FPGA;Hiroki Nakahara1, Haruyoshi Yonekawa1, Hisashi Iwamoto2, Masato Motomura3;1Tokyo Institute of Technology, 2Poco a Poco Networks, 3Hokkaido University

[HW][FPGA2017]039-A 7.663-TOPS 8.2-W Energy-efficient FPGA Accelerator for Binary Convolutional Neural Networks;Yixing Li1, Zichuan Liu2, Kai Xu1, Fengbo Ren1, Hao Yu2;1Arizona State University, 2Nanyang Technological University

[HW][FPGA2017]040-Stochastic-Based Multi-stage Streaming Realization of a Deep Convolutional Neural Network;Mingjie Lin1 and Mohammed Alawad2;1University of Central Florida, 2UCF

[HW][FPGA2017]041-fpgaConvNet: Automated Mapping of Convolutional Neural Networks on FPGAs;Stylianos Venieris and Christos Bouganis;Imperial College London

[HW][FPGA2017]042-Learning Convolutional Neural Networks for Data-Flow Graph Mapping on Spatial Programmable Architectures;Shouyi Yin, Dajiang Liu, Lifeng Sun, Xinhan Lin, Leibo Liu, Shaojun Wei;Tsinghua University

MICRO2017, 6 papers

[HW][MICRO2017]010-UNFOLD: A Memory-Efficient Speech Recognizer Using On-The-Fly WFST Composition,Reza Yazdani:Universitat Politecnica de Catalunya; Jose-Maria Arnau:Universitat Politecnica de Catalunya; Antonio Gonzalez:Universitat Politecnica de Catalunya

[HW][MICRO2017]011-IDEAL: Image DEnoising AcceLerator,Mostafa Mahmoud:University of Toronto; Bojian Zheng:University of Toronto; Alberto Delmas Lascorz:University of Toronto; Felix Heide: Stanford University/Algolux; Jonathan Assouline: Algolux; Paul Boucher: Algolux; Emmanuel Onzon: Algolux;Andreas Moshovos:University of Toronto

[HW][MICRO2017]012-Scale-Out Acceleration for Machine Learning,Jongse Park:Georgia Institute of Technology; Hardik Sharma:Georgia Institute of Technology; Divya Mahajan:Georgia Institute of Technology; Joon Kyung Kim:Georgia Institute of Technology; Hadi Esmaeilzadeh:University of California, San Diego

[HW][MICRO2017]013-Bit-Pragmatic Deep Neural Network Computing,Jorge Albericio:NVIDIA; Patrick Judd:University of Toronto; Alberto Delmas:University of Toronto; Sayeh Sharify:University of Toronto; Gerard O'Leary:University of Toronto; Roman Genov:University of Toronto; Andreas Moshovos:University of Toronto

[HW][MICRO2017]014-CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices,Caiwen Ding:Syracuse University; Siyu Liao:City University of New York. City College; Yanzhi Wang:Syracuse University;Zhe Li:Syracuse University; Ning Liu:Syracuse University; Youwei Zhuo:University of Southern California; Chao Wang:University of Southern California;Xuehai Qian:University of Southern California; Yu Bai:California State University Fullerton; Geng Yuan:Syracuse University; Xiaolong Ma:Syracuse University; Yipeng Zhang:Syracuse University; Jian Tang:Syracuse University; Qinru Qiu:Syracuse University; Xue Lin:Northeastern University; Bo Yuan:City University of New York. City College

[HW][MICRO2017]015-DeftNN: Addressing Bottlenecks for DNN Execution on GPUs via Synapse Vector Elimination and Near-compute Data Fission,Parker Hill:University of Michigan; Animesh Jain:University of Michigan; Mason Hill:University of Nevada, Las Vegas; Babak Zamirai:University of Michigan; Chang-Hong Hsu:University of Michigan; Michael Laurenzano:University of Michigan; Scott Mahlke:University of Michigan; Lingjia Tang:University of Michigan; Jason Mars:University of Michigan.

ISSCC2017, 8 papers

[HW][ISSCC2017]016-A 2.9TOPS/W Deep Convolutional Neural Network SoC in FD-SOI 28nm for Intelligent Embedded Systems,G. Desoli1, N. Chawla2, T. Boesch3, S-P. Singh2, E. Guidetti1, F. De Ambroggi4, T. Majo1,P. Zambotti4, M. Ayodhyawasi2, H. Singh2, N. Aggarwal2;1STMicroelectronics, Cornaredo, Italy; 2STMicroelectronics, Greater Noida, India;3STMicroelectronics, Geneva, Switzerland; 4STMicroelectronics, Agrate Brianza, Italy

[HW][ISSCC2017]017-DNPU: An 8.1TOPS/W Reconfigurable CNN-RNN Processor for General-Purpose Deep Neural Networks,D. Shin, J. Lee, J. Lee, H-J. Yoo, KAIST, Daejeon, Korea

[HW][ISSCC2017]018-A 28nm SoC with a 1.2GHz 568nJ/Prediction Sparse Deep-Neural-Network Engine with >0.1 Timing Error Rate Tolerance for IoT Applications,P. N. Whatmough, S. K. Lee, H. Lee, S. Rama, D. Brooks, G-Y. Wei;Harvard University, Cambridge, MA

[HW][ISSCC2017]019-A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models and Voice-Activated Power Gating,M. Price, J. Glass, A. Chandrakasan,Massachusetts Institute of Technology, Cambridge, MA

[HW][ISSCC2017]020-ENVISION: A 0.26-to-10TOPS/W Subword-Parallel Computational Accuracy-Voltage-Frequency-Scalable Convolutional Neural Network Processor in 28nm FDSOI,B. Moons, R. Uytterhoeven, W. Dehaene, M. Verhelst, KU Leuven, Leuven, Belgium

[HW][ISSCC2017]021-A 0.62mW Ultra-Low-Power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-On Haar-Like Face Detector,K. Bong, S. Choi, C. Kim, S. Kang, Y. Kim, H-J. Yoo, KAIST, Daejeon, Korea

[HW][ISSCC2017]022-A 288μW Programmable Deep-Learning Processor with 270KB On-Chip Weight Storage Using Non-Uniform Memory Hierarchy for Mobile Intelligence,S. Bang1, J. Wang1, Z. Li1, C. Gao1, Y. Kim1,2, Q. Dong1, Y-P. Chen1, L. Fick1, X. Sun1,R. Dreslinski1, T. Mudge1, H. S. Kim1, D. Blaauw1, D. Sylvester1;1University of Michigan, Ann Arbor, MI; 2CubeWorks, Ann Arbor, MI

[HW][ISSCC2017]023-A 135mW Fully Integrated Data Processor for Next-Generation Sequencing,Y-C. Wu1, J-H. Hung2, C-H. Yang1,2, 1National Taiwan University, Taipei, Taiwan,2National Chiao Tung University, Hsinchu, Taiwan

HotChips2017, 5 papers, keyword: XPU

[HW][HotChips2017]024-Microsoft Leverages FPGAs for Real-Time AI,https://www.designnews.com/electronics-test/microsoft-leverages-fpgas-real-time-ai/99071007157383

[HW][HotChips2017]025-Microsoft Unveils Real-Time AI for Azure,http://www.technewsworld.com/story/84762.html

[HW][HotChips2017]027-An Early Look at Baidu’s Custom AI and Analytics Processor, https://www.nextplatform.com/2017/08/22/first-look-baidus-custom-ai-analytics-processor/

[HW][HotChips2017]028-Microsoft Announces Project Brainwave To Take On Google’s AI Hardware Lead, https://www.forbes.com/sites/aarontilley/2017/08/22/microsoft-project-brainwave-ai-hardware-google/#6ce0745370df

Finally, we close with the opening statement on AI from ISSCC2017, which also serves as an open question to reflect on:

Intelligent Machines: Will the Technological Singularity Happen?

Artificial intelligence (AI) will no doubt have a significant impact on society in the coming years. But how intelligent can a machine be? When artificially-general intelligence is capable of recursive self-improvement, a hypothetical ‘runaway effect’ — an intelligence explosion — might happen, yielding an intelligence surpassing all current human control or understanding. This event is known as the technological singularity; this is the point beyond which events may become unpredictable or even unfathomable to human intelligence. This panel will picture the current state of the art for AI, deep learning and robotics, and try to predict where this technology is heading.

Discussion and exchange of ideas are welcome.

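Research areas include integrated circuits and wireless communications, covering deep learning accelerators, SoC design space exploration, multi-core task scheduling, and implementation of next-generation wireless communication systems; the group has successful tape-out experience at 65nm and 40nm, with ongoing projects at 28nm and 16nm.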

National ASIC Design Engineering Technology Research Center, Institute of Automation, Chinese Academy of Sciences

  • Original link: http://kuaibao.qq.com/s/20171230A03OQV00?refer=cp_1026