Artificial Intelligence: What Is Real? What Is Hype?

Relying on foreign open-source algorithms, China's AI faces a “software assembly plant” predicament

Text: Qin Longji (秦陇纪) | Source: Billy Zhang | 科学Sciences, 2019-05-05

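Artificial intelligence is computer technology at an advanced stage of development: a complex system that brings together mathematics, statistics, probability, logic, ethics, and other disciplines, and an advanced application beyond the reach of today's other information technologies. Its core technology is the AI algorithm. The goal of such algorithms is to make a computer think as a human does, learning from existing knowledge and reasoning logically as a person would. This is by no means something an ordinary company can achieve easily. Even the companies internationally recognized as leaders in AI research and development, such as Google and IBM, have invested enormous resources and enlisted top mathematicians and computer scientists, yet they have achieved only a limited degree of intelligence in computer programs, still far from true AI.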

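The “Xu Kuangdi question” (徐匡迪之问) strikes at the heart of AI in China today. It exposes the weak point in China's current AI development and also lifts the glamorous veil draped over today's so-called “artificial intelligence” algorithms. “Scientists in China's AI field who truly work on algorithms are extremely rare,” says Wan Suiren (万遂人), a professor at the School of Biological Science and Medical Engineering at Southeast University. In his view the question goes straight to the core problem of AI development in China: “If this situation does not change, it will be hard for AI applications in China to go deeper, and hard to achieve major results.” Chinese manufacturing is spreading from “hardware assembly plant” to “software assembly plant,” while government, industry, academia, and research remain as restless as ever, with old habits hard to break. A LinkedIn article published on April 17, 2019, “Artificial Intelligence: What Is Real & What Is Hype” by Billy Zhang, expresses a similar view. The article is translated below, and the original text is appended after the translation.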

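Artificial Intelligence: What Is Real and What Is Hype

Artificial intelligence (AI) can do many things, and do many of them better than humans can. This is well documented and widely reported.

But alongside real AI doing truly impressive things comes a great deal of hype, false AI, and outright deception (fake AI).

For example, the claim that AI was behind the simultaneous interpretation at an international conference in China not long ago (September 2018) was proven to be a fraud.

Does AI (the real AI) really have a mystical, superpowered intelligence that will eventually surpass human intelligence, as many hyperbolic claims would lead us to believe?

Peel away all the layers of the AI onion: machine learning, supervised learning, unsupervised learning, reinforcement learning, deep learning, neural networks (NN), artificial neural networks, recurrent neural networks, perceptrons, sigmoid neurons, synapses, feedforward, backpropagation. At the core of AI one sees algorithms that use simple (often linear) functions to approximate (“guess”) what the subject matter is, be it a lump in the human body, a handwritten letter or digit, the weather, or a game being played.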

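An algorithm is a mathematical technique for performing operations, whether addition and subtraction or Monte Carlo tree search and gradient descent search. These algorithms have long existed as part of numerical analysis for solving problems in physics and mathematics and in computer simulation. Ever since computers appeared, the scientific community has used gradient descent search to fit experimental data to a mathematical function (sometimes called a model) in order to find the first-principle parameters underlying the data.

In physics and mathematics, the first-principle parameters are themselves the objective of using a search algorithm. These parameters reveal the physical properties of the subject being studied (for example, the molecular structure of surface films in my research project, where a gradient search algorithm fitted experimental data to a function containing molecular-structure parameters). In AI, the parameters (weights and biases) of the approximating function are intermediaries that have no physical significance in themselves. They are stored (memorized) in the computer and used only in subsequent computations and predictions. The parameters are therefore unseen, hidden inside the “machine learning” black box. This is one of the reasons for the illusion that machine learning and AI are “mysterious” and “magical.” (Magicians perform magic. We all know that magic tricks are illusions created by the magician's deft sleight of hand, and we can all be confident in that knowledge. Magicians do not possess superhuman powers.)

The “machine” in machine learning is what Turing's paper (Computing Machinery and Intelligence) called the “digital computer.” “Learning” is nothing more than the repetitive (iterative) steps inherent in a search algorithm, which successively fine-tune or update (learn) the parameters until the approximation is as close as possible to the correct (known) answer. Mathematically, this is achieved when the mean square error is minimized. “Training” in machine learning is nothing more than using the given data to find the parameters (weights and biases) of the approximating function.

It is perfectly fine to describe AI in anthropomorphic terms. Besides “training” and “learning,” one hears of “neural networks.” A “neural network” is just a computer programming architecture. A “neuron” is a step in the program that produces an outcome. One also hears that “machines automatically learn and improve from experience without being explicitly programmed,” when in fact machine learning is nothing but a set of computer programs. Algorithms are implemented through programming. The word “explicitly,” sandwiched inside “without being ... programmed” as if to suggest independence from human involvement or intervention, is a sleight of logic. It is this kind of misleading statement about machines' ability to “self-learn” that makes machine learning (and, by extension, AI) feel magical, mysterious, daunting, awe-inspiring, and frightening all at once. In fact, machine learning, which uses numerical analysis algorithms, is used precisely because the task is prohibitively difficult or even impossible to program explicitly.

The media get carried away by these anthropomorphic terms. In the hype about machine learning and AI, the word “artificial” is often omitted, deliberately or subconsciously (usually the latter). “Artificial” means human-made. Real AI is human-made intelligence at work, which means human intelligence at work. More intelligent AI merely means cleverer human intelligence (for example, a better algorithm, more efficient programming, or a faster computer).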

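This is not to belittle machine learning and AI. AI can do many things, and do many of them better than humans can. But hyping AI beyond what it really is leads to misunderstanding, falsehood, and fraud.

It is also common to conflate informatization with intelligentization. That is a topic for another day.

Can real AI beat humans at specific tasks? Yes, as when AlphaGo beat the best human Go players. But computers have been beating humans since they first appeared. Computers perform mathematical calculations faster than any human can. The computer itself is a product of human intelligence. Any tool humans use can do some specific task better than humans can; that is precisely a manifestation of human intelligence at work. A stone does a better job than a bare human hand at cracking walnuts, for instance. The computer is a tool invented by humans to help them do certain tasks (in this case, many tasks) better. AI, too, is a tool invented by humans to help them do certain tasks better. To be clear, no new algorithms have been invented in the course of AI research in recent decades, only new ways of applying existing algorithms. Can new applications be called inventions? That is open to debate.

Will AI displace workers? Certainly, just as computers did (and as every new tool in human history has). But while new tools displaced existing workers, they also created far more new jobs. Automated pick-and-place machines for PCB assembly displaced many bench workers, but they also enabled innovations requiring higher assembly accuracy than manual bench assembly could achieve, thereby creating more design engineering jobs. While AI will displace existing workers, it has already created enormous demand for data workers (data scientists, data engineers, and so on) and AI programmers. The examples are endless.

Data is front and center in AI. All search algorithms yield better (more accurate) results with more data (with a trade-off against time, of course: the bigger the data, the longer it takes to produce results). In addition, the quality and integrity of data are just as important as its quantity (this sounds obvious, but it often gets lost in the hype). Unfortunately, a great deal of hype surrounds data as well, particularly in the field of Big Data, and it sometimes leads to disastrous investment decisions.

Intelligence itself, however, is not related to the quantity of data. Einstein formulated his theories of relativity (special and general) through sheer intelligence, with no data to support them. Kepler induced the laws of planetary motion by analyzing the limited observational data obtained by Brahe, which in turn led to Newton's law of universal gravitation. Such abilities to reason, induce, and deduce, and to express conclusions in closed form (Newton's laws, Maxwell's equations, the Schrödinger equation, and so on), are unique to human intelligence.

Not all “big data” is Big Data, and not every decision requires big data or AI. Observing what a person orders for dinner once, or three or four times at most, is enough to determine whether that person is a vegetarian. It takes neither “big data” about the person nor AI to make that determination.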

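Using big data to make business decisions is not new either. Six Sigma is a data-driven methodology that businesses use to improve the quality of products and services, for example in car manufacturing, semiconductor fabs, or financial services. Statistical analysis of big data for assessing financial risk and detecting fraud is well established, all of it part of data mining.

It is often impractical or impossible to describe the behavior of a system, such as the weather, in closed-form expressions. Weather is affected by too many variables, not only locally but regionally and sometimes even globally. Numerical analysis (on supercomputers, no less; once called computer simulation and now called AI) has been used to simulate and forecast the weather. Yet we have all experienced (perhaps even suffered from) inaccurate weather forecasts. Why? Not because of the algorithms or computer speed, but because complete data is lacking, even though the data available is already big. This is a double whammy: the lack of complete data for training (machine learning) means less-than-accurate weights and biases in the approximating function, and the lack of complete data for forecasting then degrades forecast accuracy further. Forecasting the weather accurately requires much more first-principle research into how weather is affected by, say, the distribution of temperature, pressure, humidity, wind, airborne pollutant particles, and so on, over every square mile or every ten square miles or whatever the granularity, across fifty square miles or two hundred square miles or an entire continent. Even once such understanding is achieved, the challenge remains of gathering data at that granularity to feed the forecasting algorithms.

In a sense, AI for recognizing handwriting or playing Go operates in a “clean and complete data” environment. Everything is transparent and available. That is not so in most other real-world applications. Take AI for hiring high-performing employees as an example. First, HR must define what constitutes high performance. Second, HR must identify, over many samples (headcount), the set of attributes that cause (or at least correlate with) high performance. Neither is easy or error-free. High performance is not just a function of personal attributes; it is affected by many other factors, such as company culture and office politics. Inaccurate or erroneous data leads to inaccurate or erroneous predictions. In HR, that may simply mean a bad hire. But in healthcare, inaccurate, erroneous, or even irrelevant data can have life-or-death consequences. Whatever AI finds in healthcare applications must be double-checked by trained eyes and minds. Correlation is not the same as causation.

Likewise, in autonomous driving the data environment is not merely “not clean”; it may be heavily “polluted.” That is a topic for another day.

AI can do many things, and do many of them better than humans can. But AI will never replace or surpass human intelligence. To quote Prof. Lloyd Minor, Dean of the School of Medicine at Stanford University: “Artificial intelligence, once an academic discipline, is now on the cusp of transforming health care. A machine may never take the place of the trained eye, but it's already extending and enhancing human vision, allowing us to see things we never knew were there” (https://stan.md/2IlJAk6). Much like computer-aided design (CAD), AI is an aid that extends and enhances human abilities.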

Source: Billy Zhang (Technology Frontiers - AI - IoT - Renewable Energy)

Reference: https://www.linkedin.com/pulse/artificial-intelligence-what-real-hype-billy-zhang/

Artificial Intelligence: What Is Real & What Is Hype

Published on April 17, 2019, by Billy Zhang (Technology Frontiers - AI - IoT - Renewable Energy)

Artificial Intelligence (AI) can do lots of things and do many of them better than humans can. These are well documented and widely reported.

But along with real AI doing really great things comes a lot of hype, false AI and outright deception (fake AI).

For example, the claim that AI was behind the simultaneous interpretation at an international conference in China not long ago (September 2018) was proven to be a fraud.

Does AI (the real AI) really have the mystical superpower intelligence that eventually will surpass human intelligence, as many hyperbolic claims may lead us to believe?

If one peels away all the layers of the AI onion, of machine learning, supervised learning, unsupervised learning, reinforcement learning, deep learning, neural networks (NN), artificial NN, recurrent NN, perceptrons, sigmoid neurons, synapses, feedforward, backpropagation, weights, biases, etc., at the core of AI one sees algorithms that use simple (often linear) functions to approximate (“guess”) what the subject matter is, be it a lump in the human body, a handwritten letter or digit, the weather or playing a game.

An algorithm is a mathematical technique that performs operations, be they addition and subtraction or Monte Carlo tree search and gradient descent search. These algorithms have long existed as part of numerical analysis in solving problems in physics and mathematics, and in computer simulations. Using a gradient descent search algorithm to fit experimental data to a mathematical function (sometimes called a model), in order to find the first-principle parameters that underlie the experimental data, has been common practice in the scientific community since the advent of computers.
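The fitting procedure described above can be sketched in a few lines. This is only an illustration, not the author's research code: it assumes a straight-line model y = a*x + b, synthetic “experimental” data generated from made-up true values a = 2 and b = -1, and a hypothetical helper named `fit_line` with an arbitrary learning rate and step count.

```python
import random

def fit_line(xs, ys, lr=0.01, steps=5000):
    """Fit y = a*x + b by gradient descent on the mean square error."""
    a, b = 0.0, 0.0  # arbitrary starting guess
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((a*x + b - y)^2) with respect to a and b.
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

# Synthetic "experimental" data: assumed true parameters plus measurement noise.
random.seed(0)
true_a, true_b = 2.0, -1.0
xs = [i / 10 for i in range(50)]
ys = [true_a * x + true_b + random.gauss(0, 0.05) for x in xs]

a, b = fit_line(xs, ys)
print(f"fitted a = {a:.3f}, b = {b:.3f}")
```

The fitted values should land close to the assumed 2 and -1. In the physics setting the essay describes, those recovered numbers are the result of interest.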

In physics and mathematics, the first-principle parameters are in and of themselves the objectives of using search algorithms. These parameters reveal the physical properties of the subject matter being studied (such as the molecular structure of surface films in my research project, which used a gradient search algorithm to fit experimental data to a function containing molecular structure parameters). In AI, the parameters (weights and biases) in the approximating function are intermediaries which in and of themselves are not of any physical significance. They are stored (memorized) in the computer and are merely used in subsequent computations and predictions. Hence the parameters are not seen, or are hidden in the “machine learning” black box. This is one of the reasons that give rise to the illusion of the “mysterious” and “magical” nature of machine learning and AI. (Magicians perform magic. We all know that magic tricks are illusions created by magicians’ deft sleight of hand. We can all have confidence in that knowledge. Magicians do not possess superhuman powers.)

The word “machine” in machine learning was referred to as the “digital computer” in Turing’s paper (Computing Machinery and Intelligence). “Learning” is no more than the repetitive (iterative) steps intrinsic in the search algorithms to successively fine-tune or update (learn) the parameters until the approximation is closest to the correct (known) answer. In mathematics, this is achieved when the mean square error is minimized. “Training” in machine learning is no more than using the given data to find the parameters (weights and biases) in the approximating function.
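In symbols, the idea can be written compactly. The notation is assumed here rather than taken from the article: N data points (x_i, y_i), an approximating function f with weights w and bias b, and a step size \eta. The mean square error and one “learning” step of gradient descent are then:

```latex
E(w, b) = \frac{1}{N} \sum_{i=1}^{N} \bigl( f(x_i; w, b) - y_i \bigr)^2,
\qquad
w \leftarrow w - \eta \, \frac{\partial E}{\partial w},
\qquad
b \leftarrow b - \eta \, \frac{\partial E}{\partial b}.
```

Repeating the update until E stops decreasing is all that “training” amounts to in this picture.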

It’s perfectly fine to use personified terminology to describe AI. In addition to “training” and “learning”, one hears of “neural networks”. A “neural network” is just a computer programming architecture. A “neuron” is a step in the program that generates an outcome. One also hears that “machines automatically learn and improve from experience without being explicitly programmed”, while in fact machine learning is nothing but a set of computer programs. Algorithms are implemented through programming. The word “explicitly”, sandwiched into “without being … programmed” as if to suggest independence from human involvement or intervention, is a finesse of logic. It is such misleading statements about machines’ ability to “self-learn” that give rise to all the feelings about machine learning (and by extension, AI) being magical, mysterious, daunting, awe-inspiring and frightening, all at the same time. The fact is that machine learning, which uses numerical analysis algorithms, is used precisely because the task is prohibitively difficult or even impossible to program explicitly.
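To make the point that a “neuron” is just a step in a program, here is a minimal sketch. The function names `sigmoid`, `neuron`, and `feedforward` are chosen for this illustration, and the weights and biases are hand-picked rather than trained; nothing here comes from a particular library.

```python
import math

def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One 'neuron': a weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def feedforward(inputs, layers):
    """A 'network': each layer is a list of (weights, bias) pairs applied to the previous outputs."""
    values = inputs
    for layer in layers:
        values = [neuron(values, weights, bias) for weights, bias in layer]
    return values

# Hypothetical hand-picked parameters, for illustration only.
layers = [
    [([1.0, -1.0], 0.0), ([0.5, 0.5], -0.5)],  # hidden layer with two neurons
    [([1.0, 1.0], -1.0)],                      # output layer with one neuron
]
output = feedforward([1.0, 1.0], layers)
print(output)
```

Every line is ordinary arithmetic written by a programmer. What training changes is only the numbers stored in `layers`.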

The media get carried away by this personified terminology. In the hype about machine learning and AI, the word “artificial” is often deliberately or subconsciously (usually the latter) omitted. “Artificial” means human-made. Real AI really is human-made intelligence at work, which means human intelligence at work. More intelligent AI merely means cleverer human intelligence (e.g., a better algorithm, more efficient programming or a faster computer).

This is not to belittle machine learning and AI. AI can do lots of things and do many of them better than humans can. But hyping AI beyond what it really is leads to misunderstanding, falsehoods and fraud.

It’s also common to confuse informatization with intelligentization. Topic for another day.

Can real AI beat humans in specific tasks? Yes, like AlphaGo beating the best human Go players. But computers have beaten humans since their advent. Computers perform mathematical calculations faster than any human can. The computer itself is a product of human intelligence. Any tool used by humans can do specific tasks better than humans can. That’s precisely a manifestation of human intelligence at work. Stones can do a better job than a bare human hand can, as in cracking walnuts. The computer is a tool invented by humans to help humans do certain tasks (in this case, many tasks) better. AI is also a tool invented by humans to help humans do certain tasks better. Just to be clear, no new algorithms have been invented in the course of AI research in the last decades, only new ways of applying existing algorithms. Can new applications be called inventions? Subject to debate.

Will AI displace the workforce? Absolutely, just like computers did (or any new tool in human history). But while new tools displaced the existing workforce, they also created a far larger new one. The automated pick-and-place machines for PCB assembly displaced lots of bench workers, but they also enabled more innovations that required higher assembly accuracy than manual assembly by bench workers could achieve, thus creating more design engineering jobs. While AI will displace the existing workforce, it has already created huge demand for data workers (data scientists, data engineers, etc.) and AI programmers. Examples are limitless.

Data is at the front and center of AI. All the search algorithms yield better (more accurate) results with bigger data (there is of course a trade-off with time: the bigger the data, the more time it takes to yield results). In addition, the quality and integrity of data are just as important as the quantity of data (sounds like a no-brainer, but often lost in the hype). Unfortunately, lots of hype is created about data as well, particularly in the field of Big Data, sometimes leading to disastrous investment decisions.
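The trade-off between data size, accuracy and time can be seen in a toy experiment. The sketch below is an assumption-laden illustration, not a benchmark: it fits the simplest possible model (a single constant, whose least-squares estimate is the average) to noisy measurements of a made-up true value 3.0, using a hypothetical helper named `estimate`.

```python
import random
import time

def estimate(n, true_value=3.0, noise=1.0, seed=0):
    """Estimate a constant from n noisy measurements by averaging them."""
    rng = random.Random(seed)
    samples = [true_value + rng.gauss(0, noise) for _ in range(n)]
    return sum(samples) / n

results = {}
for n in (10, 1_000, 100_000):
    start = time.perf_counter()
    value = estimate(n)
    elapsed = time.perf_counter() - start
    # Record the absolute error and the wall-clock time for this data size.
    results[n] = (abs(value - 3.0), elapsed)
    print(f"n={n:>7}  error={results[n][0]:.4f}  seconds={elapsed:.4f}")
```

With more measurements the error typically shrinks while the running time grows, which is the trade-off the paragraph describes. Exact timings depend on the machine.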

However, intelligence itself is not related to the quantity of data. Einstein formulated his theories of relativity (Special and General) through sheer intelligence, without any data to support them. Kepler induced the laws of planetary motion by analyzing limited observational data obtained by Brahe, which in turn led to Newton’s law of universal gravitation. Such reasoning and induction/deduction abilities, and the capability to express conclusions in closed-form expressions (Newton’s laws, Maxwell’s equations, the Schrödinger equation, etc.), are unique to human intelligence.

Not all “big data” are Big Data, and not all decisions require big data or AI to make. Observing what a person orders for dinner just once, or three to four times at most, is enough to determine if the person is a vegetarian. It takes neither “big data” about the person nor AI to make that determination.

Using big data to make business decisions is not new either. Six Sigma is a data-driven methodology used in businesses to improve the quality of products and services, such as in car manufacturing, semiconductor fabs or financial services. Statistical analysis of big data for assessment of financial risks and detection of fraud is well established, all part of data mining.

Often it is impractical or impossible to describe the behavior of a system, such as the weather, in closed-form expressions. Weather is affected by too many variables, not just locally but regionally and sometimes even globally. Numerical analysis (on supercomputers, no less; it used to be called computer simulation and is now called AI) has been used to simulate and forecast weather. But we all have experienced (perhaps even suffered from) the inaccuracies of weather forecasts. Why? Not because of algorithms or computer speed, but because of a lack of complete data, even though the data available are already big. This is a double whammy: the lack of complete data for training (machine learning) means less than accurate weights and biases in the approximating function which, combined with the lack of complete data for forecasting, degrades forecasting accuracy still further. Forecasting the weather accurately requires a lot more first-principle research into how weather is affected by, say, the distribution of temperature, pressure, humidity, wind, polluting particles in the air, etc., over every square mile or every ten square miles or whatever granularity, over fifty square miles or two hundred square miles or an entire continent. Even after such understanding is achieved, the challenge remains of gathering data with such granularity to input to the algorithms making the forecast.

In a sense, AI for recognizing handwriting or playing Go operates in a “clean and complete data” environment. Everything is transparent and available. Not so in most other real-world applications. Take as an example AI for hiring high-performance employees. First, HR must define what constitutes high performance. Second, HR must identify the set of attributes that are causes of (or at least correlated with) high performance, over many samples (headcounts). Neither is easy or error-free. High performance is not just a function of personal attributes. It’s affected by many other factors such as company culture and office politics. Inaccurate or erroneous data result in inaccurate or erroneous predictions. In HR, it may simply mean making a bad hire. But in healthcare, inaccurate, erroneous or even irrelevant data may have life-or-death consequences. Whatever AI finds in healthcare applications must be double-checked by trained eyes and minds. A correlation is not the same as a cause-and-effect relationship.
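The difference between correlation and cause can be shown with a toy simulation of the hiring example. Everything in it is an assumption made for illustration: a hidden factor (called `culture` here) drives both a recorded attribute and the performance score, the attribute has no direct effect on performance, and `pearson` is a small helper written for this sketch.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)
n = 2000

# Hidden factor (say, team culture) that HR did not record.
culture = [rng.gauss(0, 1) for _ in range(n)]

# The recorded attribute and the performance score both depend on the hidden
# factor; the attribute has no direct effect on performance in this toy model.
attribute = [c + rng.gauss(0, 0.5) for c in culture]
performance = [c + rng.gauss(0, 0.5) for c in culture]
observed = pearson(attribute, performance)

# Assign the attribute at random instead, independent of the hidden factor.
assigned = [rng.gauss(0, 1) for _ in range(n)]
intervened = pearson(assigned, performance)

print(f"observed r = {observed:.2f}, after random assignment r = {intervened:.2f}")
```

The recorded data show a strong correlation, yet changing the attribute on its own does nothing to performance. A model trained only on the recorded data could not tell the two situations apart.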

Likewise, in autonomous driving, the data environment is not just “not clean”; it could be heavily “polluted”. Topic for another day.

AI can do lots of things and do many of them better than humans can. But AI will never replace or surpass human intelligence. To quote Prof. Lloyd Minor, Dean of the School of Medicine at Stanford University, “Artificial intelligence, once an academic discipline, is now on the cusp of transforming health care. A machine may never take the place of the trained eye, but it’s already extending and enhancing human vision, allowing us to see things we never knew were there” (https://stan.md/2IlJAk6). Much like computer-aided design (CAD), AI is an aid that extends and enhances human abilities.

Billy Zhang

Technology Frontiers - AI - IoT - Renewable Energy


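References

1. Billy Zhang. Artificial Intelligence: What Is Real & What Is Hype. [EB/OL], LinkedIn, https://www.linkedin.com/pulse/artificial-intelligence-what-real-hype-billy-zhang/, published 2019-04-17, accessed 2019-05-05.

2. Qin Longji (秦陇纪). Western Philosophy and Artificial Intelligence, Computers; Introduction to the Data Science and Big Data Technology Major; The Current State of Artificial Intelligence Research and Its Applications in Education; Introduction to Data Resources in the Information Society; Provenance and Simplification of Plain-Text Data; The Big Data Simplification Technology System. [EB/OL], 数据简化DataSimp (WeChat public account), http://www.datasimp.org, 2017-06-06.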

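Disclaimer: The material comes from publicly available journals and media. The article is shared solely to disseminate academic news, and the cited references make its sources traceable. This account holds no position of its own and does not endorse the views or statements presented.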


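© Qin Longji (秦陇纪) 2010-2019, 科学Sciences

Editor's note from 科学Sciences: Artificial intelligence: what is real, and what is hype? Real AI is doing truly impressive things, but it comes with a great deal of hype, false AI, and outright deception (fake AI). Chinese manufacturing is spreading from “hardware assembly plant” to “software assembly plant,” while government, industry, academia, and research remain as restless as ever, with old habits hard to break.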


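About this article: Artificial Intelligence: What Is Real? What Is Hype? Author: Qin Longji (秦陇纪). Material: LinkedIn; the Knowledge Simplification and DataSimp (数据简化) communities, under an NC non-commercial license; Qin Longji's WeChat group chats and public account. The references give the sources of quoted material. Download: for the PDF of this article (15k characters, 1 figure, 6 pages), support the account with a tip and then send the keyword “AI真实炒作” in the public account's input box to receive the link; see also the article category menu of “科学Sciences”. Copyright: popular-science articles are for study and research only. Copyright in public materials belongs to the original authors; do not use them for commercial or illegal purposes. The DataSimp community reserves the corresponding rights. For unclear or missing citations, translator's notes, or sources, or for copyright questions, leave a message on the public account or email QinDragon2010@qq.com. Reprinting: state and retain the author, source, and date, for example “此文出自:©科学Sciences,作者:秦陇纪,时间:20190505Mon©秦陇纪2010-2019汇译编”.

“科学Sciences”: public science outreach

Postscript: A full hundred years after science was introduced to China, it still has not been popularized or generally accepted by the public. The spirit of science is hypothesis and doubt; the method of science is experiment and measurement; and the essence of scientific theory is that scientists use mathematical tools to give qualitative and quantitative explanations of nature and society. For nearly four thousand years, some peoples' thinking about nature and society has, at its most superficial, amounted to vague, simple language that expresses emotion and invites blind belief and obedience, whereas rational people analyze concrete phenomena until they arrive at scientific thinking built mainly on mathematics and similar tools. Scientific experiments and scientific hypotheses both need the support of engineering and technology. Theory and technology alike have enriched the body of science, and one must not stop at the surface-level thinking of language as a tool. Still less should science and technology be treated as a language-arts subject, with words in books substituted for experimental design and engineering practice. Science is one of the brilliant achievements of human civilization, but it has its limits and is not omnipotent. The 科学Sciences account holds no position of its own and only presents people's academic views. Thank you for reading! 科学Sciences advocates “rational thought and an independent spirit,” focuses on the development and progress of scholars, academia, and scholarship, and from time to time recommends outstanding scholars and their articles. Experts in science, engineering, technology, education, media, and other fields are welcome to submit articles and join the DataSimp community. Everyone is welcome to share, tip, and support science outreach.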

Originally published on the WeChat public account 科学Sciences (SciencesPub)

Original publication date: 2019-05-05
