03 Types of Learning

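This note surveys the common types of machine learning along four dimensions: Output Space, Data Label, Protocol, and Input Space. See the course slides for details.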

Output Space

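Viewed along the output space $\mathcal{Y}$ dimension, different output spaces correspond to different machine learning algorithms.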

Binary Classification

Binary classification: the output space is $\mathcal{Y} = \{-1, +1\}$. Common examples:

  • credit approve/disapprove
  • email spam/non-spam
  • patient sick/not sick
  • ad profitable/not profitable

This is an extremely important class of problems:

Core and important problem with many tools as building block of other tools.
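
As a minimal sketch of what a hypothesis with a binary output space looks like, the snippet below thresholds a weighted score, in the spirit of the credit approval example. The feature meanings, weights, and threshold are made-up values for illustration; in practice a learning algorithm would choose them from data.

```python
def sign(value: float) -> int:
    """Map a real-valued score to the binary output space {-1, +1}."""
    return 1 if value > 0 else -1


def approve_credit(features, weights, threshold):
    """Return +1 (approve) or -1 (disapprove) from a weighted score.

    `features`, `weights`, and `threshold` are hypothetical inputs.
    """
    score = sum(w * x for w, x in zip(weights, features))
    return sign(score - threshold)


# Hypothetical applicant: (annual income, years in job, outstanding debt)
applicant = (8.0, 3.0, 2.0)
weights = (1.0, 0.5, -2.0)
decision = approve_credit(applicant, weights, threshold=4.0)
print(decision)
```

Whatever the inputs, the returned value is always one of the two labels, which is what makes this a binary classification hypothesis.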

Multiclass Classification

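Multiclass classification: the output space is $\mathcal{Y} = \{1, 2, \ldots, K\}$; binary classification is the special case $K = 2$. Common examples: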

  • coin recognition
  • written digits ⇒ 0, 1, · · · , 9
  • pictures ⇒ apple, orange, strawberry
  • emails ⇒ spam, primary, social, promotion, update
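
Regression

Regression: the output space is $\mathcal{Y} = \mathbb{R}$, or $\mathcal{Y} = [\text{lower}, \text{upper}] \subset \mathbb{R}$, which corresponds to bounded regression. Common examples: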

  • patient features ⇒ how many days before recovery
  • company data ⇒ stock price
  • climate data ⇒ temperature

Regression has been studied extensively in statistics:

Also core and important with many ‘statistical’ tools as building block of other tools.

Structured Learning

Structured learning: the output is a structure rather than a single label. Common examples:

  • sentence ⇒ structure (class of each word), i.e. sequence labeling
  • protein data ⇒ protein folding
  • speech data ⇒ speech parse tree

Huge multiclass classification problem (structure = hyperclass) without ‘explicit’ class definition.

Data Label

Classified by whether the data labels $y_n$ are present, how many there are, and what form they take:

  • supervised: all $y_n$
  • unsupervised: no $y_n$
  • semi-supervised: some $y_n$
  • reinforcement: implicit $y_n$ by goodness($\tilde{y}_n$)

Supervised Learning

Supervised learning: every $\mathbf{x}_n$ comes with corresponding $y_n$.

Binary and multiclass classification, for example, are typical supervised learning problems.

Unsupervised Learning

Unsupervised learning: diverse, with possibly very different performance goals.

Unsupervised learning comes in many forms. Common ones include:

  • clustering
    • unsupervised multiclass classification
    • e.g. articles ⇒ topics
  • density estimation
    • unsupervised bounded regression
    • traffic reports with location ⇒ dangerous areas
  • outlier detection
    • extreme ‘unsupervised binary classification’
    • e.g. Internet logs ⇒ intrusion alert
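
To make the clustering case concrete, here is a minimal sketch of k-means, one standard clustering algorithm (the notes do not name a specific one). The points and initial centers are made up for illustration. No labels are used anywhere.

```python
def kmeans(points, centers, iterations=10):
    """Plain k-means: alternate assignment and center updates."""
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centers)
        ]
    return centers, clusters


# Hypothetical 2-D data forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
print(centers)
print(clusters)
```

The returned cluster index plays the role of a class label, which is why clustering can be read as an unsupervised counterpart of multiclass classification.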

Semi-supervised Learning

Semi-supervised learning: leverage unlabeled data to avoid ‘expensive’ labeling.

Common examples:

  • face images with a few labeled ⇒ face identifier (Facebook)
  • medicine data with a few labeled ⇒ medicine effect predictor

For a detailed explanation, see Semi-supervised learning.

Reinforcement Learning

Reinforcement: learn with ‘partial/implicit information’ (often sequentially).

Samples take the form $(\mathbf{x}, \tilde{y}, \text{goodness})$. Common examples:

  • (customer, ad choice, ad click earning) ⇒ ad system
  • (cards, strategy, winning amount) ⇒ black jack agent
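
The ad system example can be sketched as an epsilon-greedy learner, which is one simple strategy and an assumption here, not something the notes prescribe. The learner only sees the earning of the ad it chose, never the correct answer. For brevity the sketch ignores the customer features and uses fixed, made-up earnings per ad; real click earnings would be noisy.

```python
import random


def run_ad_system(earnings, rounds=200, epsilon=0.1, seed=0):
    """Epsilon-greedy choice among ads, learning only from observed earnings.

    `earnings` maps each hypothetical ad to the payoff the environment
    returns when it is shown; the learner never reads this table directly.
    """
    rng = random.Random(seed)
    ads = list(earnings)
    totals = {ad: 0.0 for ad in ads}
    counts = {ad: 0 for ad in ads}

    for t in range(rounds):
        if t < len(ads):
            choice = ads[t]  # show every ad once to start
        elif rng.random() < epsilon:
            choice = rng.choice(ads)  # explore a random ad
        else:
            # exploit: the ad with the best average earning so far
            choice = max(ads, key=lambda ad: totals[ad] / counts[ad])
        goodness = earnings[choice]  # feedback for the chosen ad only
        totals[choice] += goodness
        counts[choice] += 1

    return {ad: totals[ad] / counts[ad] for ad in ads}, counts


averages, counts = run_ad_system({"ad_a": 0.2, "ad_b": 1.0, "ad_c": 0.5})
print(max(averages, key=averages.get))
```

Each round produces one (ad choice, earning) pair, matching the implicit-label sample form described above.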

Different Protocol

Different protocols correspond to different learning philosophies:

  • batch: duck feeding
  • online: passive sequential
  • active: question asking (sequentially), i.e. query the $y_n$ of the chosen $\mathbf{x}_n$

The training data differs accordingly:

  • batch: all known data
  • online: sequential (passive) data
  • active: strategically-observed data

Batch Learning

Learn from all known data at once.

Batch supervised multiclass classification: learn from all known data.

  • batch of (email, spam?) ⇒ spam filter
  • batch of (patient, cancer) ⇒ cancer classifier
  • batch of patient data ⇒ group of patients

Online Learning

Receive data instances sequentially, updating the model after each one.

Online: hypothesis ‘improves’ through receiving data instances sequentially

For example, an online spam filter, which sequentially:

  1. observes an email $\mathbf{x}_t$
  2. predicts spam status with current $g_t(\mathbf{x}_t)$
  3. receives the ‘desired label’ $y_t$ from the user, and then updates $g_t$ with $(\mathbf{x}_t, y_t)$

PLA can be easily adapted to online protocol.
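
A minimal sketch of how PLA might look under the online protocol: the perceptron's mistake-driven update is applied to one instance at a time as it arrives. The stream below is made-up data, and each input carries a leading constant 1 as the bias term.

```python
def predict(w, x):
    """Current hypothesis: sign of the inner product, in {-1, +1}."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score > 0 else -1


def online_pla(stream, dim):
    """Process (x_t, y_t) pairs one at a time, updating only on mistakes."""
    w = [0.0] * dim
    mistakes = 0
    for x_t, y_t in stream:
        # Observe x_t, predict with the current weights, then see y_t.
        if predict(w, x_t) != y_t:
            # Perceptron update: w <- w + y_t * x_t
            w = [wi + y_t * xi for wi, xi in zip(w, x_t)]
            mistakes += 1
    return w, mistakes


# Hypothetical stream of (features, label) pairs.
stream = [
    ((1.0, 2.0, 1.0), 1),
    ((1.0, -1.0, -2.0), -1),
    ((1.0, 3.0, 0.5), 1),
    ((1.0, -2.0, -0.5), -1),
]
w, mistakes = online_pla(stream, dim=3)
print(w, mistakes)
```

Unlike the batch setting, the learner never revisits earlier instances; it keeps only the current weight vector.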

Active Learning

When the model is unsure, it hands the question to the user and thereby obtains high-quality labeled samples.

Active: improve hypothesis with fewer labels (hopefully) by asking questions strategically
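
One way to ask questions "strategically" is uncertainty sampling: query the label of the unlabeled point the current hypothesis is least sure about. The sketch below assumes a linear hypothesis and takes "least sure" to mean the smallest absolute score; both are illustrative assumptions, and the weights and pool are made up.

```python
def score(w, x):
    """Real-valued score of a linear hypothesis."""
    return sum(wi * xi for wi, xi in zip(w, x))


def choose_query(w, unlabeled):
    """Pick the unlabeled x closest to the decision boundary.

    A small |w . x| means the prediction could easily flip, so a label
    for that point is assumed to be the most informative one to request.
    """
    return min(unlabeled, key=lambda x: abs(score(w, x)))


w = (0.0, 1.0, -1.0)  # hypothetical current weights (bias first)
pool = [(1.0, 5.0, 1.0), (1.0, 2.0, 1.8), (1.0, -3.0, 2.0)]
query = choose_query(w, pool)
print(query)
```

The chosen point would then be shown to the user for labeling, and the hypothesis updated with the answer.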

Different Input Space

Classified by what the input space represents.

Concrete Features

Concrete features: each dimension of $\mathcal{X} \subseteq \mathbb{R}^d$ represents ‘sophisticated physical meaning’.

Common examples:

  • (size, mass) for coin classification
  • customer info for credit approval
  • patient info for cancer diagnosis
  • often including human intelligence on the task

These concrete features have clear meanings and are highly interpretable, and they are the ‘easy’ case for ML.

Raw Features

Raw features: often need human or machines to convert to concrete ones.

Examples include image pixels and speech signals.
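
As a sketch of converting raw features into concrete ones by hand, the snippet below reduces a grid of pixel values to two numbers with a clear meaning: average intensity and left-right symmetry. The choice of these two features and the tiny image are assumptions for illustration.

```python
def intensity(image):
    """Average pixel value over the whole image."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def symmetry(image):
    """Negative mean absolute difference between the image and its
    left-right mirror; 0.0 means perfectly symmetric."""
    diffs = [abs(a - b) for row in image for a, b in zip(row, reversed(row))]
    return -sum(diffs) / len(diffs)


# Hypothetical 3x3 grayscale image with pixel values in [0, 1].
image = [
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
features = (intensity(image), symmetry(image))
print(features)
```

Nine raw pixel values become a two-dimensional concrete feature vector that a simple classifier can use directly.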

Abstract Features

Abstract: again need feature conversion/extraction/construction.

For example, ID features:

  • student ID in online tutoring system (KDDCup 2010)
  • advertisement ID in online ad system
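
An ID carries no numeric meaning by itself, so it has to be converted before a model can use it. One-hot encoding is a simple conversion, shown here as an assumed illustration with made-up student IDs; the notes do not prescribe a particular method.

```python
def one_hot_encoder(ids):
    """Build an encoder mapping each known ID to a one-hot vector."""
    index = {id_: i for i, id_ in enumerate(sorted(set(ids)))}

    def encode(id_):
        vector = [0] * len(index)
        if id_ in index:  # unseen IDs map to the all-zero vector
            vector[index[id_]] = 1
        return vector

    return encode


# Hypothetical student IDs seen in the training data.
encode = one_hot_encoder(["s1003", "s1001", "s1002", "s1001"])
print(encode("s1002"))
```

With many distinct IDs these vectors become very long and sparse, which is part of why abstract features call for more careful feature extraction or construction.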
