IBM's SystemML machine learning system becomes Apache project

Author: 首席架构师智库
Published: 2018-04-09 11:53:07
Column: 超级架构师

Tech giants are racing to open source their machine learning systems and become the dominant platform. Apache SystemML has a clear enterprise spin.

IBM on Monday said its machine learning system, dubbed SystemML, has been accepted as an open source project by the Apache Incubator.

The Apache Incubator is the entry path to becoming a project of The Apache Software Foundation. The general idea behind the incubator is to ensure that code donations adhere to Apache's legal guidelines and that communities follow its guiding principles.

IBM said in June that it would donate SystemML as an open source project.

What's notable about IBM's SystemML milestone is that open sourcing machine learning systems is becoming a trend. To wit:

  1. Google recently open sourced its TensorFlow machine learning tool under an Apache 2.0 license.
  2. Facebook has also contributed its machine learning and artificial intelligence tools to the Torch open source project.

For enterprises, the upshot is that there will be a bevy of open source machine learning code bases to consider. Google TensorFlow and Facebook Torch are tools to train neural networks. SystemML is aimed at broadening the ecosystem to business use.

Why are tech giants going open source with their machine learning tools? The machine learning platform that gets the most data will learn faster and then become more powerful. That cycle will just result in more data to ingest. IBM is looking to work the enterprise angle on machine learning. Microsoft may be another entry on the enterprise side, but may not go the Apache route.

In addition, there are precedents for how open sourcing big analytics ideas can pay off. Hadoop, which implements the MapReduce model, started as an open source project and would be a cousin of whatever Apache machine learning system wins out.

IBM's SystemML, which is now Apache SystemML, is used to create industry-specific machine learning algorithms for enterprise data analysis. IBM created SystemML so it could write one codebase that could apply to multiple industries and platforms. If SystemML can scale, IBM's Apache move could provide a gateway to its other analytics wares.

The Apache SystemML project has included more than 320 patches covering everything from APIs and data ingestion to documentation, more than 90 contributions to Apache Spark, and 15 additional organizations adding to the SystemML engine.

Here's the full definition of the Apache SystemML project:

SystemML provides declarative large-scale machine learning (ML) that aims at flexible specification of ML algorithms and automatic generation of hybrid runtime plans ranging from single-node, in-memory computations to distributed computations on Apache Hadoop and Apache Spark. ML algorithms are expressed in an R-like or Python-like syntax that includes linear algebra primitives, statistical functions, and ML-specific constructs. This high-level language significantly increases the productivity of data scientists as it provides (1) full flexibility in expressing custom analytics, and (2) data independence from the underlying input formats and physical data representations. Automatic optimization according to data characteristics, such as distribution on the disk file system and sparsity, as well as processing characteristics in the distributed environment, such as number of nodes, CPU, and memory per node, ensures both efficiency and scalability.
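To make the idea of choosing a runtime plan from data characteristics more concrete, here is a minimal Python sketch. It is not SystemML's optimizer or API: the names `MatrixStats`, `estimate_bytes`, and `choose_plan`, the 12-bytes-per-non-zero sparse cost, and the single memory budget are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class MatrixStats:
    """Hypothetical summary of an input matrix (not a SystemML type)."""
    rows: int
    cols: int
    sparsity: float  # fraction of non-zero cells, between 0.0 and 1.0


def estimate_bytes(m: MatrixStats) -> int:
    """Rough in-memory size, taking the cheaper of a dense or sparse layout."""
    cells = m.rows * m.cols
    dense = cells * 8  # one 8-byte double per cell
    # Assumed sparse cost: 8-byte value plus 4-byte column index per non-zero.
    sparse = int(cells * m.sparsity) * 12
    return min(dense, sparse)


def choose_plan(m: MatrixStats, memory_budget_bytes: int) -> str:
    """Pick single-node execution if the data fits in memory, else distributed."""
    if estimate_bytes(m) <= memory_budget_bytes:
        return "single_node"
    return "distributed"


if __name__ == "__main__":
    budget = 1024 ** 3  # assumed 1 GiB budget on the driver node
    for stats in (
        MatrixStats(1_000, 1_000, 1.0),
        MatrixStats(10_000_000, 1_000, 1.0),
        MatrixStats(10_000_000, 1_000, 0.001),
    ):
        print(stats, "->", choose_plan(stats, budget))
```

In this toy model, the same ten-million-row matrix is sent to the distributed plan when dense but stays on a single node when only 0.1% of its cells are non-zero, which is the kind of size- and sparsity-driven decision the definition above describes.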

The link to Apache:

http://systemml.incubator.apache.org/

Originally published 2015-12-04 on the WeChat official account 首席架构师智库. For infringement concerns, contact cloudcommunity@tencent.com.
