
[Lu Qin in Practice] Data Science Resources Recommended by Data School

Published 2018-02-26 10:58:53


Blogs

  • Simply Statistics: Written by the Biostatistics professors at Johns Hopkins University who also run Coursera's Data Science Specialization
  • yhat's blog: Beginner-friendly content, usually in Python
  • No Free Hunch (Kaggle's blog): Mostly interviews with competition winners, or updates on their competitions
  • FastML: Various machine learning content, often with code
  • Edwin Chen: Infrequently updated, but long and thoughtful pieces
  • FiveThirtyEight: Tons of timely data-related content
  • Machine Learning Mastery: Frequent posts on machine learning, very accessible
  • Data School: Kevin Markham's blog! Beginner-focused, with reference guides and videos
  • MLWave: Detailed posts on Kaggle competitions, by a Kaggle Master
  • Data Science 101: Short, frequent content about all aspects of data science
  • ML in the Valley: Thoughtful pieces by the Director of Analytics at Codecademy


Aggregators

  • DataTau: Like Hacker News, but for data
  • MachineLearning on reddit: Very active subreddit
  • Quora's Machine Learning section: Lots of interesting Q&A
  • Quora's Data Science topic FAQ
  • KDnuggets: Data mining news, jobs, classes and more

DC Data Groups

  • Data Community DC: Coordinates six local data-related meetup groups
  • District Data Labs: Offers courses and other projects to local data scientists

Online Classes

  • Coursera's Data Science Specialization: Nine courses (running every month) and a Capstone project, taught in R
  • Stanford's Statistical Learning: By the authors of An Introduction to Statistical Learning and Elements of Statistical Learning, taught in R, highly recommended, running January through April 2015 (preview the lecture videos)
  • Coursera's Machine Learning (Andrew Ng): Andrew Ng's acclaimed course, taught in MATLAB/Octave (preview the lecture videos)
  • Coursera's Machine Learning (Pedro Domingos): No upcoming sessions (preview the lecture videos)
  • Caltech's Learning from Data: Widely praised, not language-specific
  • Udacity's Data Analyst Nanodegree: Project-based curriculum using Python, R, MapReduce, MongoDB
  • Coursera's Data Mining Specialization: Brand new specialization beginning February 2015
  • Coursera's Natural Language Processing: No upcoming sessions, but lectures and slides are available
  • SlideRule's Data Analysis Learning Path: Curated content from various online classes
  • Udacity's Intro to Artificial Intelligence: Taught by Peter Norvig and Sebastian Thrun
  • Coursera's Neural Networks for Machine Learning: Taught by Geoffrey Hinton, no upcoming sessions
  • statistics.com: Many online courses in data science
  • CourseTalk: Read reviews of online courses

Online Content from Offline Classes

  • Harvard's CS109 Data Science: Covers topics similar to those in General Assembly's course
  • Columbia's Data Mining Class: Excellent slides
  • Harvard's CS171 Visualization: Includes programming in D3

Face-to-Face Educational Programs

  • Comparison of data science bootcamps: Up-to-date list maintained by a Zipfian Academy graduate
  • The Complete List of Data Science Bootcamps & Fellowships
  • Zipfian Academy: Offers Data Science Immersive, Data Engineering Immersive, Master's in Big Data (San Francisco, but possibly expanding)
  • Data Science Retreat: Primarily uses R (Berlin)
  • Metis Data Science Bootcamp: Newer bootcamp (New York)
  • Persontyle: Various course offerings (based in London)
  • Software Carpentry: Two-day workshops, primarily for researchers and hosted by universities (worldwide)
  • Colleges and Universities with Data Science Degrees


Conferences

  • Knowledge Discovery and Data Mining (KDD): Hosted by ACM
  • O'Reilly Strata + Hadoop World: Big focus on "big data" (San Jose, London, New York)
  • PyData: For developers and users of Python data tools (worldwide)
  • PyCon: For developers and users of Python (Montreal in April 2015)


Books

  • An Introduction to Statistical Learning with Applications in R (free PDF)
  • Elements of Statistical Learning (free PDF)
  • Think Stats (free PDF or HTML)
  • Mining of Massive Datasets (free PDF)
  • Python for Informatics (free PDF or HTML)
  • Statistics: Methods and Applications (free HTML)
  • Python for Data Analysis
  • Data Smart: Using Data Science to Transform Information into Insight
  • Sams Teach Yourself SQL in 10 Minutes

Other Resources

  • Open Source Data Science Masters: Huge list of resources
  • Data Science Trello Board: Another list of resources
  • The Hitchhiker's Guide to Python: Online guide to understanding Python and getting good at it
  • Python Reference: Python tips, tutorials, and more
  • videolectures.net: Tons of academic videos
  • Metacademy: Quick summary of many machine learning terms, with links to resources for learning more
  • Terms in data science defined in one paragraph


Originally published 2015-06-25 on the WeChat official account 数据科学与人工智能 (Data Science and Artificial Intelligence).
