Computer Architecture: A Quantitative Approach [Must read]
Streaming Systems [Book]
Kubernetes in Action (currently reading) [Book]
Videos
A New Golden Age for Computer Architecture: History, Challenges, and Opportunities. David Patterson [YouTube]
How to Have a Bad Career. David Patterson (I am a big fan) [YouTube]
SysML 18: Perspectives and Challenges. Michael Jordan [YouTube]
SysML 18: Systems and Machine Learning Symbiosis. Jeff Dean [YouTube]
Courses
CS294: AI For Systems and Systems For AI. [UC Berkeley] (Strongly recommended)
CSE 599W: Systems for ML. [Chen Tianqi] [University of Washington]
CSE 291F: Advanced Data Analytics and ML Systems. [UCSD]
CSci 8980: Machine Learning in Computer Systems [University of Minnesota, Twin Cities]
Surveys
Hidden technical debt in machine learning systems [Paper]
Sculley, David, et al. (NIPS 2015)
Summary:
End-to-end arguments in system design [Paper]
Saltzer, Jerome H., David P. Reed, and David D. Clark.
System Design for Large Scale Machine Learning [Thesis]
Deep Learning Inference in Facebook Data Centers: Characterization, Performance Optimizations and Hardware Implications [Paper]
Park, Jongsoo, Maxim Naumov, Protonu Basu et al. arXiv 2018
Summary: This paper characterizes DL models and then derives new design principles for DL hardware.
Useful Tools
Intel® VTune™ Amplifier [Website]
Stop guessing why software is slow. Advanced sampling and profiling techniques quickly analyze your code, isolate issues, and deliver insights for optimizing performance on modern processors
NVIDIA DALI [GitHub]
A library containing both highly optimized building blocks and an execution engine for data pre-processing in deep learning applications
gpushare-scheduler-extender [GitHub]
A Kubernetes scheduler extender for GPU sharing: some tasks can run on the same NVIDIA GPU device to increase GPU utilization
TensorRT [NVIDIA]
It is designed to work in a complementary fashion with training frameworks such as TensorFlow, Caffe, PyTorch, and MXNet. It focuses specifically on running an already trained network quickly and efficiently on a GPU for the purpose of generating a result