
Learning Caffe: Blobs, Layers, and Nets

bear_fish
Published 2018-09-19 15:56:39

Original article

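  • Deep networks are built from many interconnected Layers. Caffe defines a network layer by layer, from the data at the bottom to the loss at the top. Data flows through the network in forward and backward passes, and Caffe defines the Blob to store, communicate, and manipulate that data. The Blob is Caffe's standard data structure and unified memory interface. The Layer is the foundation of both the model and its computation. The network (Net) is a collection of Layers. The Blob describes how data is stored in and passed between Layers and Nets.
  • Solving is defined separately, to decouple modeling from optimization configuration.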

Blob storage and communication

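  • A Blob is a wrapper over the data that Caffe stores and communicates, and it synchronizes between the CPU and the GPU behind the scenes. Mathematically, a Blob is an N-dimensional array. It provides a unified memory interface for holding data such as batches of images, model parameters, and intermediate values. The Blob hides the details of mixed CPU/GPU operation by synchronizing from the CPU host to the GPU device as needed.
  • The usual Blob dimensions for image data are number N x channel K x height H x width W. Blob memory is row-major (not column-major), so the value at index (n, k, h, w) is physically located at index ((n * K + k) * H + h) * W + w.
  • Number / N is the batch size of the data. Processing data in batches gives better throughput.
  • Channel / K is the feature dimension; for RGB images, for example, K = 3.
  • The dimensions of a parameter Blob depend on the type and configuration of its Layer. For a Convolution Layer with 96 filters of 11 x 11 spatial dimension and 3 input channels, the Blob is 96 x 3 x 11 x 11. For an Inner Product Layer with 1000 output channels and 1024 input channels, the Blob is 1000 x 1024.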
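The row-major indexing rule above can be sketched in a few lines of C++. This is only an illustration: `BlobShape`, `offset`, and `count` are hypothetical helpers written for this example, not Caffe's own `Blob` class.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not Caffe's Blob): the shape of a 4-D blob.
struct BlobShape {
  std::size_t N;  // number (batch size)
  std::size_t K;  // channels
  std::size_t H;  // height
  std::size_t W;  // width
};

// Row-major offset of element (n, k, h, w): ((n * K + k) * H + h) * W + w.
inline std::size_t offset(const BlobShape& s, std::size_t n, std::size_t k,
                          std::size_t h, std::size_t w) {
  assert(n < s.N && k < s.K && h < s.H && w < s.W);
  return ((n * s.K + k) * s.H + h) * s.W + w;
}

// Total number of values stored in the blob.
inline std::size_t count(const BlobShape& s) {
  return s.N * s.K * s.H * s.W;
}
```

With this layout the last index (w) varies fastest in memory. For the convolution parameters mentioned above, `count(BlobShape{96, 3, 11, 11})` gives 34848 values.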

Implementation Details

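  • For the data in a network we care about values and gradients, so a Blob stores two chunks of memory: data and diff. The former is the normal data passed along the network; the latter is the gradient computed by the network.
  • The actual values can live on either the CPU or the GPU, so there are two ways to access them: the const way, which does not change the values, and the mutable way, which does: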
```
const Dtype* cpu_data() const;
Dtype* mutable_cpu_data();
// (similarly for gpu and diff)
```
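  • The reason for this design is that a Blob uses a SyncedMem class to synchronize values between the CPU and the GPU, which hides the synchronization details and minimizes data transfer. As a rule of thumb, use the const call whenever you do not want to change the values, and never store the pointers in your own objects. Each time you need a pointer, call the accessor again, because that call is where SyncedMem decides whether data must be copied between the CPU and the GPU.
  • In practice, when a GPU is present, CPU code loads data into a Blob, calls a device kernel to do the GPU computation, and passes the Blob on to the next Layer, ignoring the low-level details while keeping performance high. As long as all Layers have GPU implementations, all intermediate data and gradients remain on the GPU.
  • The following example shows when a Blob copies data: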
```
// Assume the data starts out on the CPU and we have a blob.
const Dtype* foo;
Dtype* bar;
foo = blob.gpu_data(); // data copied: cpu->gpu
foo = blob.cpu_data(); // no data copied, both sides are up to date
bar = blob.mutable_gpu_data(); // no data copied
// ... some operations ...
bar = blob.mutable_gpu_data(); // no data copied while we are still on the GPU
foo = blob.cpu_data(); // data copied: gpu->cpu, because the GPU side modified the data
foo = blob.gpu_data(); // no data copied, both sides are up to date
bar = blob.mutable_cpu_data(); // still no data copied
bar = blob.mutable_gpu_data(); // data copied: cpu->gpu
bar = blob.mutable_cpu_data(); // data copied: gpu->cpu
```
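One way to see why those copies happen where they do is to track which side holds the newest data. The sketch below is a toy model under stated assumptions: `ToySyncedMem` is a hypothetical class, the "GPU" buffer is an ordinary host vector, and it only counts copies; it is not Caffe's actual SyncedMem implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of lazy CPU/GPU synchronization (hypothetical, not Caffe code).
class ToySyncedMem {
 public:
  enum Head { HEAD_AT_CPU, HEAD_AT_GPU, SYNCED };

  explicit ToySyncedMem(std::size_t n) : cpu_(n, 0.0f), gpu_(n, 0.0f) {
    assert(n > 0);
  }

  // Const-style access: sync if needed, both sides stay valid afterwards.
  const float* cpu_data() { to_cpu(); return cpu_.data(); }
  const float* gpu_data() { to_gpu(); return gpu_.data(); }

  // Mutable access: sync if needed, then mark the other side as stale.
  float* mutable_cpu_data() { to_cpu(); head_ = HEAD_AT_CPU; return cpu_.data(); }
  float* mutable_gpu_data() { to_gpu(); head_ = HEAD_AT_GPU; return gpu_.data(); }

  int copies_to_gpu() const { return copies_to_gpu_; }
  int copies_to_cpu() const { return copies_to_cpu_; }

 private:
  // Copy gpu->cpu only when the GPU side holds the newest data.
  void to_cpu() {
    if (head_ == HEAD_AT_GPU) { cpu_ = gpu_; ++copies_to_cpu_; head_ = SYNCED; }
  }
  // Copy cpu->gpu only when the CPU side holds the newest data.
  void to_gpu() {
    if (head_ == HEAD_AT_CPU) { gpu_ = cpu_; ++copies_to_gpu_; head_ = SYNCED; }
  }

  std::vector<float> cpu_;
  std::vector<float> gpu_;    // stand-in for device memory
  Head head_ = HEAD_AT_CPU;   // assumption: data starts on the CPU
  int copies_to_gpu_ = 0;
  int copies_to_cpu_ = 0;
};
```

Replaying the call sequence from the example against this toy class yields two cpu->gpu copies and two gpu->cpu copies, at the same calls the comments mark as copying. It also shows why a stored pointer is risky: only a fresh accessor call triggers the sync check.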

Layer computation and connections

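  • The Layer is the essence of a Caffe model and the fundamental unit of computation. Layers convolve filters, pool, take inner products, apply nonlinearities such as rectified-linear and sigmoid and other elementwise transformations, normalize, load data, and compute losses. See the layer catalogue for the available Layer types, which cover the state-of-the-art types needed for deep learning tasks.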
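  • A Layer takes its input through bottom connections and produces its output through top connections. Every Layer defines three critical computations: setup, forward, and backward.
  • Setup: initialize the Layer and its connections to other Layers, once at model initialization.
  • Forward: take the input from bottom, perform the computation the Layer defines, and send the output to top.
  • Backward: take the gradient with respect to the top output, compute the gradient with respect to the input, and send it to bottom. A Layer with parameters also computes the gradient with respect to its parameters and stores it internally.
  • Forward and Backward each have two implementations: one for CPU and one for GPU.
  • Because Caffe is highly modular, writing a custom Layer is supposed to be easy (I have my doubts). Define the setup, forward, and backward methods, and the Layer is ready to be included in a net.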
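To make the three operations concrete, here is a minimal sketch of a rectified-linear layer. The names `ToyBlob`, `ToyLayer`, and `ToyReLULayer` and their method signatures are assumptions made for this example; Caffe's real Layer class has a different, richer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy blob holding values (data) and gradients (diff). Hypothetical type.
struct ToyBlob {
  std::vector<float> data;
  std::vector<float> diff;
};

// Hypothetical interface with the three operations; not Caffe's signatures.
class ToyLayer {
 public:
  virtual ~ToyLayer() = default;
  virtual void SetUp(const ToyBlob& bottom, ToyBlob* top) = 0;
  virtual void Forward(const ToyBlob& bottom, ToyBlob* top) = 0;
  virtual void Backward(const ToyBlob& top, ToyBlob* bottom) = 0;
};

class ToyReLULayer : public ToyLayer {
 public:
  // Setup: size the top blob to match the bottom blob.
  void SetUp(const ToyBlob& bottom, ToyBlob* top) override {
    top->data.assign(bottom.data.size(), 0.0f);
    top->diff.assign(bottom.data.size(), 0.0f);
  }

  // Forward: top = max(bottom, 0), element by element.
  void Forward(const ToyBlob& bottom, ToyBlob* top) override {
    assert(top->data.size() == bottom.data.size());
    for (std::size_t i = 0; i < bottom.data.size(); ++i) {
      top->data[i] = bottom.data[i] > 0.0f ? bottom.data[i] : 0.0f;
    }
  }

  // Backward: pass the top gradient through where the input was positive.
  void Backward(const ToyBlob& top, ToyBlob* bottom) override {
    assert(top.diff.size() == bottom->data.size());
    bottom->diff.resize(bottom->data.size());
    for (std::size_t i = 0; i < bottom->data.size(); ++i) {
      bottom->diff[i] = bottom->data[i] > 0.0f ? top.diff[i] : 0.0f;
    }
  }
};
```

A rectified-linear layer has no parameters, so Backward only produces the gradient with respect to its input. A layer with weights would additionally accumulate a parameter gradient internally.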

Net definition and operation

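  • A Caffe Net jointly defines a function and its gradient. The composition of every Layer's output (the forward pass) computes the function that performs the task, and the composition of every Layer's backward computes the gradient of the loss in order to learn and optimize the parameters. Caffe models are end-to-end learning engines.
  • A Net is a set of Layers connected in a computation graph, specifically a directed acyclic graph (DAG). Caffe keeps all intermediate values so that the forward and backward passes are correct. A typical Net begins with a data layer that takes in the input data and ends with a loss layer that computes the objective of the task, for example classification or reconstruction.
  • A Net is defined as a set of Layers and their connections in a plaintext modeling language. A simple logistic regression classifier, shown in the figure below:

[Figure: logreg]

can be defined as follows: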

```
name: "LogReg"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "input_leveldb"
    batch_size: 64
  }
}
layer {
  name: "ip"
  type: "InnerProduct"
  bottom: "data"
  top: "ip"
  inner_product_param {
    num_output: 2
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip"
  bottom: "label"
  top: "loss"
}
```
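The way a Net composes layer outputs in the forward pass and layer gradients in the backward pass can be illustrated with a toy chain of scalar layers. Everything here is hypothetical (`ScaleLayer`, `ToyNet`, and the squared-error loss are choices made for the example and differ from the softmax loss in the definition above); it only shows the ordering of the two passes and why intermediate values are kept.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy scalar layer: y = w * x. Hypothetical, for illustration only.
struct ScaleLayer {
  explicit ScaleLayer(float weight) : w(weight) {}

  float w;
  float x = 0.0f;       // bottom value saved by Forward for use in Backward
  float w_diff = 0.0f;  // gradient w.r.t. the parameter, stored internally

  float Forward(float bottom) { x = bottom; return w * bottom; }
  float Backward(float top_diff) {
    w_diff = top_diff * x;  // dL/dw = dL/dy * x
    return top_diff * w;    // dL/dx = dL/dy * w
  }
};

// Toy net: a simple chain (a special case of a DAG) ending in a loss.
class ToyNet {
 public:
  std::vector<ScaleLayer> layers;

  // Forward: run layers bottom to top, then loss = 0.5 * (output - label)^2.
  float Forward(float input, float label) {
    assert(!layers.empty());
    float v = input;
    for (std::size_t i = 0; i < layers.size(); ++i) v = layers[i].Forward(v);
    diff_ = v - label;
    return 0.5f * diff_ * diff_;
  }

  // Backward: run layers top to bottom, starting from d(loss)/d(output).
  float Backward() {
    float g = diff_;
    for (std::size_t i = layers.size(); i-- > 0;) g = layers[i].Backward(g);
    return g;  // gradient w.r.t. the net input
  }

 private:
  float diff_ = 0.0f;
};
```

Each layer keeps the bottom value it saw during Forward, because Backward needs it to compute the parameter gradient. That is the toy counterpart of a Net retaining its intermediate values.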
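  • Net::Init() handles model initialization. It mainly does two things: it scaffolds the whole network graph by creating the Blob and Layer objects, and it calls each Layer's SetUp() method. It also validates that the overall Net architecture is correct, and it logs its progress as it goes: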
```
I0902 22:52:17.931977 2079114000 net.cpp:39] Initializing net from parameters:
name: "LogReg"
[...model prototxt printout...]
# construct the network layer-by-layer
I0902 22:52:17.932152 2079114000 net.cpp:67] Creating Layer mnist
I0902 22:52:17.932165 2079114000 net.cpp:356] mnist -> data
I0902 22:52:17.932188 2079114000 net.cpp:356] mnist -> label
I0902 22:52:17.932200 2079114000 net.cpp:96] Setting up mnist
I0902 22:52:17.935807 2079114000 data_layer.cpp:135] Opening leveldb input_leveldb
I0902 22:52:17.937155 2079114000 data_layer.cpp:195] output data size: 64,1,28,28
I0902 22:52:17.938570 2079114000 net.cpp:103] Top shape: 64 1 28 28 (50176)
I0902 22:52:17.938593 2079114000 net.cpp:103] Top shape: 64 (64)
I0902 22:52:17.938611 2079114000 net.cpp:67] Creating Layer ip
I0902 22:52:17.938617 2079114000 net.cpp:394] ip <- data
I0902 22:52:17.939177 2079114000 net.cpp:356] ip -> ip
I0902 22:52:17.939196 2079114000 net.cpp:96] Setting up ip
I0902 22:52:17.940289 2079114000 net.cpp:103] Top shape: 64 2 (128)
I0902 22:52:17.941270 2079114000 net.cpp:67] Creating Layer loss
I0902 22:52:17.941305 2079114000 net.cpp:394] loss <- ip
I0902 22:52:17.941314 2079114000 net.cpp:394] loss <- label
I0902 22:52:17.941323 2079114000 net.cpp:356] loss -> loss
# set up the loss and configure the backward pass
I0902 22:52:17.941328 2079114000 net.cpp:96] Setting up loss
I0902 22:52:17.941328 2079114000 net.cpp:103] Top shape: (1)
I0902 22:52:17.941329 2079114000 net.cpp:109]     with loss weight 1
I0902 22:52:17.941779 2079114000 net.cpp:170] loss needs backward computation.
I0902 22:52:17.941787 2079114000 net.cpp:170] ip needs backward computation.
I0902 22:52:17.941794 2079114000 net.cpp:172] mnist does not need backward computation.
# determine outputs
I0902 22:52:17.941800 2079114000 net.cpp:208] This network produces output loss
# finish initialization and report memory usage
I0902 22:52:17.941810 2079114000 net.cpp:467] Collecting Learning Rate and Weight Decay.
I0902 22:52:17.941818 2079114000 net.cpp:219] Network initialization done.
I0902 22:52:17.941824 2079114000 net.cpp:220] Memory required for data: 201476
```
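The "Memory required for data" figure in the log can be checked against the "Top shape" lines. The sketch below assumes each value occupies 4 bytes (single precision); the log does not state the data type, so that is an assumption, and `blob_count` and `memory_for_data` are hypothetical helpers, not Caffe functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of values in a blob: the product of its dimensions.
// An empty shape (logged as "Top shape: (1)") counts as a single value.
inline std::size_t blob_count(const std::vector<std::size_t>& shape) {
  std::size_t c = 1;
  for (std::size_t d : shape) c *= d;
  return c;
}

// Bytes needed for the data of all top blobs, given the size of one value.
inline std::size_t memory_for_data(
    const std::vector<std::vector<std::size_t>>& tops,
    std::size_t bytes_per_value) {
  assert(bytes_per_value > 0);
  std::size_t total = 0;
  for (const auto& shape : tops) total += blob_count(shape);
  return total * bytes_per_value;
}
```

For the four top blobs in the log (64 x 1 x 28 x 28, 64, 64 x 2, and the scalar loss) the counts are 50176 + 64 + 128 + 1 = 50369 values, and 50369 x 4 bytes = 201476, which matches the last log line.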

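  • Note: the network structure is device agnostic, because Blobs and Layers hide implementation details from the model definition. Once the network is defined, it runs on the CPU or the GPU according to a single switch that is read with Caffe::mode() and set with Caffe::set_mode(). A Layer produces the same results in CPU and GPU mode (up to numerical error). Switching between CPU and GPU is seamless and independent of the model definition.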
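The single-switch idea can be sketched as a global mode that each layer consults when dispatching its forward pass. This is a toy illustration: `ToyCaffe`, `Mode`, and `ToyDoubleLayer` are hypothetical names, and the "GPU" routine here runs ordinary host code as a stand-in for a device kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Mode { CPU, GPU };

// Hypothetical global switch, loosely modeled on the mode()/set_mode() idea.
class ToyCaffe {
 public:
  static Mode mode() { return mode_ref(); }
  static void set_mode(Mode m) { mode_ref() = m; }

 private:
  static Mode& mode_ref() {
    static Mode current = Mode::CPU;
    return current;
  }
};

// Toy layer that doubles its input; Forward dispatches on the global mode.
class ToyDoubleLayer {
 public:
  int cpu_calls = 0;
  int gpu_calls = 0;

  std::vector<float> Forward(const std::vector<float>& bottom) {
    assert(!bottom.empty());
    return ToyCaffe::mode() == Mode::GPU ? Forward_gpu(bottom)
                                         : Forward_cpu(bottom);
  }

 private:
  std::vector<float> Forward_cpu(const std::vector<float>& bottom) {
    ++cpu_calls;
    std::vector<float> top(bottom.size());
    for (std::size_t i = 0; i < bottom.size(); ++i) top[i] = 2.0f * bottom[i];
    return top;
  }

  // Placeholder for a device kernel: same math, executed on the host here.
  std::vector<float> Forward_gpu(const std::vector<float>& bottom) {
    ++gpu_calls;
    std::vector<float> top(bottom.size());
    for (std::size_t i = 0; i < bottom.size(); ++i) top[i] = 2.0f * bottom[i];
    return top;
  }
};
```

The caller never names a device: it flips the mode and calls Forward, and the model definition stays unchanged.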

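Model definition

  • Models are defined in plaintext prototxt files, while learned models are serialized in binary form as caffemodel files. The model format is defined in the caffe.proto file. That source file is largely self-explanatory, so reading it is encouraged.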
Originally published: 2016-08-24