
# Convolutional Neural Networks (CNN)


It is important to note that the Convolution operation captures the local dependencies in the original image.
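As a sketch of this idea (not from the original article), here is a minimal NumPy implementation of valid 2-D cross-correlation; note that each output value depends only on the local patch of the input under the filter, which is the local dependency the text refers to. The function and filter names are my own for illustration.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation: each output pixel depends only on
    the local input patch under the kernel (a local dependency)."""
    h, w = kernel.shape
    H, W = image.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+h, j:j+w] * kernel)
    return out

image = np.arange(25.0).reshape(5, 5)
edge = np.array([[1.0, -1.0]])       # a toy horizontal-difference filter
result = conv2d(image, edge)
print(result.shape)                  # (5, 4): output shrinks without padding
```

Note how the output is spatially smaller than the input when no padding is used; the Zero-padding section below addresses exactly this.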

Depth:

The depth of the output equals the number of filters: each independent filter extracts a different feature. For example, if 3 filters are applied to the original image, the resulting feature maps can be thought of as 3 matrices stacked on top of each other.

Stride

The stride is the number of pixels by which the filter shifts at each step as it slides over the input; a larger stride produces a smaller feature map.

Zero-padding

Now, let’s take a look at padding. Before getting into that, let’s think about a scenario. What happens when you apply three 5 x 5 x 3 filters to a 32 x 32 x 3 input volume? The output volume would be 28 x 28 x 3. Notice that the spatial dimensions decrease. As we keep applying conv layers, the size of the volume will decrease faster than we would like. In the early layers of our network, we want to preserve as much information about the original input volume as possible so that we can extract those low-level features. Let’s say we want to apply the same conv layer but we want the output volume to remain 32 x 32 x 3. To do this, we can apply a zero padding of size 2 to that layer. Zero padding pads the input volume with zeros around the border. With a zero padding of two, this would result in a 36 x 36 x 3 input volume.
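The arithmetic above follows the standard output-size formula O = (W − F + 2P) / S + 1, where W is the input size, F the filter size, P the padding, and S the stride. A quick sketch (the function name is my own):

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    return (input_size - filter_size + 2 * padding) // stride + 1

# 32x32 input, 5x5 filter, no padding -> 28, as in the text
print(conv_output_size(32, 5))             # 28
# zero padding of 2 preserves the 32x32 spatial size
print(conv_output_size(32, 5, padding=2))  # 32
```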

ReLU is applied to the feature maps obtained from convolution, replacing every negative value in the matrix with 0.

The purpose of ReLU is to introduce non-linearity in our ConvNet, since most of the real-world data we would want our ConvNet to learn is non-linear (convolution is a linear operation – element-wise matrix multiplication and addition – so we account for non-linearity by introducing a non-linear function like ReLU).

Spatial Pooling (also called subsampling or downsampling) reduces the dimensionality of each feature map but retains the most important information. Spatial Pooling can be of different types: Max, Average, Sum etc.
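To make Max Pooling concrete, here is a minimal sketch (my own helper, assuming a square pooling window with stride equal to the window size) that keeps the largest value in each 2 x 2 region of a feature map:

```python
import numpy as np

def max_pool(fmap, size=2):
    """Max pooling with a size x size window and matching stride."""
    H, W = fmap.shape
    trimmed = fmap[:H - H % size, :W - W % size]     # drop ragged edges
    return (trimmed
            .reshape(H // size, size, W // size, size)
            .max(axis=(1, 3)))                       # max over each window

fmap = np.array([[1.0, 3.0, 2.0, 0.0],
                 [4.0, 2.0, 1.0, 5.0],
                 [0.0, 1.0, 2.0, 2.0],
                 [3.0, 0.0, 1.0, 4.0]])
pooled = max_pool(fmap)
print(pooled)  # [[4. 5.] [3. 4.]]
```

Each 2 x 2 output is a quarter the area of the input, yet the strongest activation in every region survives, which is the "retains the most important information" property mentioned above.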

The complete CNN training process

Step 1: Initialize all filters and parameters/weights with random values.

Step 2: The network takes a training image as input and, after the sequence of convolution, ReLU, pooling, and fully connected layers, outputs class probabilities, e.g. [0.2, 0.4, 0.1, 0.3]. Because the weights were set randomly, the probabilities from this first pass are essentially random.

Step 3: Compute the classification error at the output layer: Total Error = ∑ ½ (target probability − output probability)².

Step 4: Use backpropagation to compute the gradients of the Total Error with respect to all weights in the network, and use gradient descent to update all filter values/weights and parameters so that the output error is minimized. Each weight is adjusted in proportion to its contribution to the total error. Training on the same image again might now produce probabilities of [0.1, 0.1, 0.7, 0.1], a step closer to the target [0, 0, 1, 0], showing that the network has learned to represent the image better by adjusting its weights/filters. Note that the number of filters, the filter sizes, and the network architecture are fixed before Step 1 and do not change during training; only the filter matrices and connection weights are updated.

Step 5: Repeat Steps 2–4 for every image in the training set. Once the network's filters/parameters/weights are determined this way, a new image can be classified by running it through the same convolution, pooling, and fully connected pipeline described earlier.
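The forward-pass/error/update cycle in Steps 2–4 can be illustrated at toy scale. This NumPy sketch (my own, not from the article) trains a single fully connected layer with softmax on one example, so the output probabilities move toward the one-hot target exactly as described above:

```python
import numpy as np

np.random.seed(0)
x = np.random.rand(8)                  # a tiny "flattened image"
target = np.array([0.0, 0.0, 1.0, 0.0])  # one-hot target, as in the text
W = np.random.randn(4, 8) * 0.1        # Step 1: random weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.5
for step in range(100):
    probs = softmax(W @ x)             # Step 2: forward pass -> probabilities
    grad = np.outer(probs - target, x) # Steps 3-4: error gradient w.r.t. W
    W -= lr * grad                     # gradient-descent weight update

print(np.round(probs, 2))              # mass now concentrated on class 2
```

After training, the probability for the correct class dominates, mirroring the [0.1, 0.1, 0.7, 0.1] example above.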

Recognizing the handwritten digit dataset with a CNN

```python
# Assumes TensorFlow/Keras; imports added for completeness.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Dropout

model = Sequential()
# Two conv layers extract features from the 28x28 grayscale input
model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                 activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(filters=64, kernel_size=(5, 5), padding='same',
                 activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))       # halve the spatial dimensions
model.add(Flatten())                          # feature maps -> vector
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.2))                       # regularization
model.add(Dense(10, activation='softmax'))    # 10 digit classes

model.compile(optimizer='SGD', loss='categorical_crossentropy',
              metrics=['accuracy'])

# X_train/Y_train and X_val/Y_val are assumed to be prepared elsewhere
model.fit(X_train, Y_train, batch_size=100, verbose=1, epochs=5,
          validation_data=(X_val, Y_val))
```

Original article: https://kuaibao.qq.com/s/20190129A0G0YE00?refer=cp_1026
