Data Scientist / Data Engineer

As the field of data science continues to grow and mature, it is nice to begin seeing some distinction among the roles within it. A new job title gaining popularity is the data engineer. In this post, I lay out some of the distinctions between the two roles.

Data Scientist

A data scientist is responsible for pulling insights from data. It is the data scientist's job to pull data, create models, create data products, and tell a story. A data scientist should typically interact with customers and/or executives. A data scientist should love scrubbing a dataset for more and more understanding.

The main goal of a data scientist is to produce data products and tell the stories of the data. A data scientist would typically have stronger statistics and presentation skills than a data engineer.

Data Engineer

Data engineering is more focused on the systems that store and retrieve data. A data engineer is responsible for building and deploying storage systems that can adequately handle the needs of the organization. Sometimes the need is to ingest fast, real-time data streams. Other times it is to store massive numbers of large video files. Still other times it is to serve a very high volume of reads. In other words, a data engineer needs to build systems that can handle the three Vs of big data (volume, velocity, and variety).

The main goal of a data engineer is to make sure the data is properly stored and available to the data scientist and others who need access. A data engineer would typically have stronger software engineering and programming skills than a data scientist.

Conclusion

It is too early to tell whether these two roles will ever have a clear division of responsibilities, but it is nice to see some of the load taken off the mythical all-in-one data scientist. Both roles are important to a properly functioning data science team.

Do you see other distinctions between the roles?

Original link: http://101.datascience.community/2014/07/08/data-scientist-vs-data-engineer

Originally published on the WeChat official account 数据科学与人工智能 (DS_AI_shujuren)

Original publication date: 2016-12-17
