
Original Translation | Straight from Apple's keynote: the deep learning feature Create ML, does it look useless?

灯塔大数据
Published 2018-07-25 16:53:05
Featured in the column: 灯塔大数据
Introduction: Apple announced no hardware at all at this year's WWDC, leading some to call it "the softest Apple keynote in history." Apple unveiled iOS 12 and macOS Mojave, and the continuity between macOS and iOS makes for a genuine productivity story, but many people overlooked Create ML, a feature Apple introduced for developers. This article takes a detailed look. (More past translations at the end of this article.)

At the keynote, Apple introduced a new feature for developers: Create ML.

Machine learning has become a common tool in the developer kit, so it makes sense that Apple would want to improve the process. But what it has shipped, which is essentially local training, does not seem particularly useful.

The most important step in building a machine learning model, such as one that recognizes faces or turns speech into text, is the "training." That is when the computer churns through reams of data like photos or audio and establishes correlations between the input (a voice) and the desired output (distinct words).

Training is extremely CPU-intensive. Machine learning generally needs orders of magnitude more computing power and memory than everyday work does, like the difference between rendering a blockbuster film and rendering a video game. You could do it on your laptop, but a measly four-core Intel processor and onboard GPU have so little horsepower that it could take dozens of hours or even days.

That is why "training" is usually done in the cloud, where the computing power of many machines can be pooled for the task.

The point of Create ML is to let you do machine learning on your own laptop. As briefly demonstrated, you drag your data onto the interface, tweak a few settings, and, on a maxed-out iMac Pro, a trained model can be ready in as little as 20 minutes. It also compresses the model so you can more easily ship it in an app (a feature that seems to be included in Apple's ML tools already). This is mainly possible because it applies Apple's own vision and language models rather than building new ones from scratch.
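The drag-and-drop flow described above also maps onto a few lines of Swift in a macOS playground. A minimal sketch, assuming macOS Mojave with the CreateML framework; the folder layout and paths here are hypothetical placeholders, not from the article:

```swift
import CreateML
import Foundation

// Hypothetical layout: Flowers/rose/*.jpg, Flowers/tulip/*.jpg, ...
// (one subdirectory per label)
let trainingDir = URL(fileURLWithPath: "/Users/me/Flowers")

// Train an image classifier on top of Apple's built-in vision model;
// this transfer-learning step is why training stays fast on a laptop.
let classifier = try MLImageClassifier(
    trainingData: .labeledDirectories(at: trainingDir))

// Write out a compact .mlmodel ready to drop into an Xcode project.
try classifier.write(
    to: URL(fileURLWithPath: "/Users/me/Flowers.mlmodel"),
    metadata: MLModelMetadata(shortDescription: "Toy flower classifier"))
```

Because the classifier fine-tunes Apple's bundled vision model rather than learning features from scratch, the exported file stays small, which is the compression the article mentions.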

In reality, though, a model's quality depends in large part on the nature, arrangement and precision of the "layers" of the training network, and on how long it is given to train. Say an hour of training on a MacBook Pro amounts to 10 teraflop-hours of work. Sent to the cloud, that same workload could be split across 10 machines and finished in six minutes with the same result, or it could run on all 10 machines for the full hour and accumulate 100 teraflop-hours, almost certainly yielding a better model.
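To make the trade-off concrete, the numbers in this thought experiment can be written out; every figure below is the article's hypothetical, not a benchmark:

```swift
// Hypothetical figures from the thought experiment above.
let laptopThroughput = 10.0  // teraflop-hours of training per hour, one MacBook Pro
let cloudMachines = 10.0     // identical machines available in the cloud

// Option A: the same 10 TF-hours of work, spread across 10 machines,
// finishes in a tenth of the wall-clock time.
let minutesForSameWork = 60.0 * laptopThroughput
                       / (laptopThroughput * cloudMachines)  // 6 minutes

// Option B: the same wall-clock hour on all 10 machines does
// ten times the total work, which usually means a better model.
let teraflopHoursInOneHour = laptopThroughput * cloudMachines  // 100 TF-hours
```

The choice between shorter turnaround and a better-trained model is exactly the flexibility the next paragraph credits to cloud platforms.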

This kind of flexibility is one of the core conveniences of computing as a service, which is why so much of the world runs on cloud platforms such as Amazon Web Services and Microsoft Azure.

People generally will not put sensitive data, such as medical histories or X-rays, in cloud storage. And I doubt that individual developers with little or no access to cloud training services are likely, or even allowed, to get hold of such privileged data in the first place. A hard drive loaded with the PET scans of 500,000 people is simply a disaster waiting to happen. So access control is the name of the game: truly private data is stored centrally.

Research organizations, hospitals and universities have partnerships with cloud services, and may even have dedicated computing clusters of their own. Their requirements are also different, and more demanding than what Apple's off-the-shelf offering can satisfy.

It may sound as though I am mocking this local-training model for no reason. But the way Apple framed it suggests that anyone can easily move professional "training" onto their own laptop and get the same results, which is simply unrealistic. Perhaps as the platform diversifies, developers will put local "training" to use, but for now it feels like a feature without a purpose.

Original article

Apple’s Create ML is a nice feature with an unclear purpose

Apple announced a new feature for developers today called Create ML. Because machine learning is a commonly used tool in the developer kit these days, it makes sense that Apple would want to improve the process. But what it has here, essentially local training, doesn’t seem particularly useful.

The most important step in the creation of a machine learning model, like one that detects faces or turns speech into text, is the “training.” That’s when the computer is chugging through reams of data like photos or audio and establishing correlations between the input (a voice) and the desired output (distinct words).

This part of the process is extremely CPU-intensive, though. It generally requires orders of magnitude more computing power (and often storage) than you have sitting on your desk. Think of it like the difference between rendering a 3D game like Overwatch and rendering a Pixar film. You could do it on your laptop, but it would take hours or days for your measly four-core Intel processor and onboard GPU to handle.

That’s why training is usually done “in the cloud,” which is to say, on other people’s computers set up specifically for the task, equipped with banks of GPUs and special AI-inclined hardware.

Create ML is all about doing it on your own PC, though: as briefly shown onstage, you drag your data onto the interface, tweak some stuff and you can have a model ready to go in as little as 20 minutes if you’re on a maxed-out iMac Pro. It also compresses the model so you can more easily include it in apps (a feature already included in Apple ML tools, if I remember correctly). This is mainly possible because it’s applying Apple’s own vision and language models, not building new ones from scratch.

The quality of a model depends in great part on the nature, arrangement and precision of the “layers” of the training network, and how long it’s been given to cook. Given an hour of real time, a model trained on a MacBook Pro will have, let’s just make up a number, 10 teraflop-hours of training done. If you send that data to the cloud, you could choose to either have those 10 teraflop-hours split between 10 computers and have the same results in six minutes, or after an hour it could have 100 teraflop-hours of training, almost certainly resulting in a better model.

That kind of flexibility is one of the core conveniences of computing as a service, and why so much of the world runs on cloud platforms like AWS and Azure, and soon dedicated AI processing services like Lobe.

My colleagues suggested that people who are dealing with sensitive data in their models, for example medical history or x-rays, wouldn’t want to put that data in the cloud. But I don’t think that single developers with little or no access to cloud training services are the kind that are likely, or even allowed, to have access to privileged data like that. If you have a hard drive loaded with the PET scans of 500,000 people, that seems like a catastrophic failure waiting to happen. So access control is the name of the game, and private data is stored centrally.

Research organizations, hospitals and universities have partnerships with cloud services and perhaps even their own dedicated computing clusters for things like this. After all, they also need to collaborate, be audited and so on. Their requirements are also almost certainly different and more demanding than Apple’s off the shelf stuff.

I guess I sound like I’m ragging for no reason on a tool that some will find useful. But the way Apple framed it made it sound like anyone can just switch over from a major training service to their own laptop easily and get the same results. That’s just not true. Perhaps as the platform diversifies developers will find ways to make it useful, but for now it feels like a feature without a purpose.

Editor: 小柳

Past translations:

Original Translation | Chinese schools use AI to grade student essays, scoring nearly as well as teachers

Original Translation | Researchers train AI drones in virtual reality to reduce self-driving car collisions

Original Translation | Nvidia's Jensen Huang on the White House's AI initiative

Original Translation | As Bitcoin rises, finance experts sue Facebook over get-rich-quick cryptocurrency ad scams

Original Translation | Why AI can't solve Facebook's fake news problem

This article is part of the Tencent Cloud self-media sharing program, shared from the WeChat official account 灯塔大数据. Originally published 2018-06-06. For any infringement concerns, contact cloudcommunity@tencent.com for removal.
