
CNN + MNIST + input_data Digit Recognition


Handwritten digit recognition with LeNet-5, the classic convolutional neural network; the model is shown in the figure below:

Detailed structure of the model:

Flow chart:

Talk is cheap, show me the code. We start with input_data, the script that downloads the MNIST dataset.


Part 1: input_data

 
SOURCE_URL = 'http://yann.lecun.com/exdb/mnist/'

def maybe_download(filename, work_directory):
  """Check whether the required data has already been downloaded from the MNIST site."""
  if not os.path.exists(work_directory):  # create the directory if it does not exist
    os.mkdir(work_directory)
  filepath = os.path.join(work_directory, filename)
  if not os.path.exists(filepath):  # download the file from the site only if it is missing
    filepath, _ = urllib.request.urlretrieve(SOURCE_URL + filename, filepath)
    statinfo = os.stat(filepath)
    print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
  return filepath
  
  • os.path.exists(path): returns True if path exists, False otherwise
  • os.mkdir(path): creates a single directory; path is the directory to create

>>> import os

>>> os.mkdir('E:\\MNIST')

  • os.makedirs(path): creates a multi-level directory tree, including intermediate directories

>>> import os

>>> os.makedirs('E:\\mnist\\tensor')

  • os.path.join(): joins multiple path components and returns the combined path

>>> import os

>>> os.path.join('/hello/','tensor/','flow')

Output: /hello/tensor/flow

  • urllib.request.urlretrieve(url, filename=None, reporthook=None, data=None)
  • url: a remote or local URL
  • filename: the local path to save to; if None, a temporary file name is generated automatically
  • reporthook: a callback fired when the connection is established and after each data block finishes transferring; useful for showing download progress
  • data: data to send to the server. The method returns a tuple (filename, headers): filename is the local save path, headers is the server's response headers
  • os.stat(path): returns status information for the file at path
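As a quick illustration (a sketch; the filename assumes the training-image archive has already been downloaded into the current directory):

import os

info = os.stat('train-images-idx3-ubyte.gz')  # assumes this file exists locally
print(info.st_size)  # size in bytes, the same figure maybe_download prints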
def _read32(bytestream):
  dt = numpy.dtype(numpy.uint32).newbyteorder('>')  # '>' selects big-endian storage
  return numpy.frombuffer(bytestream.read(4), dtype=dt)[0]
  • Byte order is the order in which the bytes of a multi-byte value are stored in memory; the two common orders are big-endian and little-endian
  • Big-endian: the most significant byte is stored first, i.e. the high-order byte goes at the low address and the low-order byte at the high address
  • numpy.frombuffer(buffer, dtype=float, count=-1, offset=0): interprets the raw buffer contents and returns a 1-D array
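As a sanity check of the big-endian read (a minimal sketch): the four bytes 00 00 08 03 decode to 2051, exactly the magic number of the MNIST image files.

import numpy

dt = numpy.dtype(numpy.uint32).newbyteorder('>')  # big-endian uint32
print(numpy.frombuffer(b'\x00\x00\x08\x03', dtype=dt)[0])  # 2051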
def extract_images(filename):
  """Extract the images into a 4D uint8 numpy array [index, y, x, depth]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2051:  # 2051 acts as a check code: it verifies the file really is an MNIST image file
      raise ValueError(
          'Invalid magic number %d in MNIST image file: %s' %
          (magic, filename))
    num_images = _read32(bytestream)
    rows = _read32(bytestream)
    cols = _read32(bytestream)
    buf = bytestream.read(rows * cols * num_images)
    data = numpy.frombuffer(buf, dtype=numpy.uint8)
    data = data.reshape(num_images, rows, cols, 1)
    return data
  • Extracts the images and returns 4-D data: number of images, height, width, and number of channels (3 for RGB images, 1 for grayscale)
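A minimal shape check (a sketch; it assumes the archive was downloaded into MNIST_data/). The 60000 training images unpack into a (60000, 28, 28, 1) array:

data = extract_images('MNIST_data/train-images-idx3-ubyte.gz')
print(data.shape, data.dtype)  # (60000, 28, 28, 1) uint8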
def dense_to_one_hot(labels_dense, num_classes=10):
  """Convert class labels from scalars to one-hot vectors."""
  num_labels = labels_dense.shape[0]  # number of labels

  # numpy.arange(start, stop, step) creates an arithmetic sequence from start to stop with the given step
  index_offset = numpy.arange(num_labels) * num_classes
  labels_one_hot = numpy.zeros((num_labels, num_classes))

  # numpy.ndarray.flat views the array as one-dimensional;
  # numpy.ravel() flattens a multi-dimensional array to 1-D and returns a view
  labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
  return labels_one_hot
  • Converts a dense label vector into a sparse label matrix, i.e. one-hot encoding
  • shape[0]: the length of the matrix's first dimension
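A quick check of the flat-index trick (a sketch that reuses the function above):

import numpy

labels = numpy.array([3, 0, 1])
print(dense_to_one_hot(labels, num_classes=10))
# row 0 has its 1 at index 3, row 1 at index 0, row 2 at index 1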
def extract_labels(filename, one_hot=False):
  """Extract the labels into a 1D uint8 numpy array [index]."""
  print('Extracting', filename)  
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2049:
      raise ValueError(
          'Invalid magic number %d in MNIST label file: %s' %
          (magic, filename))
          
    num_items = _read32(bytestream)
    buf = bytestream.read(num_items)
    labels = numpy.frombuffer(buf, dtype=numpy.uint8)
    if one_hot:
      return dense_to_one_hot(labels)
    return labels
  • Reads the labels; if one_hot is True they are converted to a sparse matrix, e.g. [3] → [0 0 0 1 0 0 0 0 0 0]
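The labels follow the same pattern (a sketch under the same MNIST_data/ assumption):

labels = extract_labels('MNIST_data/train-labels-idx1-ubyte.gz', one_hot=True)
print(labels.shape)  # (60000, 10)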
class DataSet(object):
  def __init__(self, images, labels, fake_data=False):
    if fake_data:
      self._num_examples = 10000
    else:
      assert images.shape[0] == labels.shape[0], (
          "images.shape: %s labels.shape: %s" % (images.shape,labels.shape))
      self._num_examples = images.shape[0]
      
      # Convert shape from [num examples, rows, columns, depth]
      # to [num examples, rows*columns] (assuming depth == 1)
      assert images.shape[3] == 1
      images = images.reshape(images.shape[0],
                              images.shape[1] * images.shape[2])
      # Convert from [0, 255] -> [0.0, 1.0].
      images = images.astype(numpy.float32)
      images = numpy.multiply(images, 1.0 / 255.0)
    self._images = images
    self._labels = labels
    self._epochs_completed = 0
    self._index_in_epoch = 0
    
  @property
  def images(self):
    return self._images
  @property
  def labels(self):
    return self._labels
  @property
  def num_examples(self):
    return self._num_examples
  @property
  def epochs_completed(self):
    return self._epochs_completed
  def next_batch(self, batch_size, fake_data=False):
    """Return the next `batch_size` examples from this data set."""
    if fake_data:
      fake_image = [1.0 for _ in xrange(784)]
      # the underscore is a throwaway loop variable: used once, never referenced again
      fake_label = 0
      return [fake_image for _ in xrange(batch_size)], [
          fake_label for _ in xrange(batch_size)]
          
    start = self._index_in_epoch
    self._index_in_epoch += batch_size
    
    if self._index_in_epoch > self._num_examples:
      # Finished epoch
      self._epochs_completed += 1
      # Shuffle the data
      perm = numpy.arange(self._num_examples)
      numpy.random.shuffle(perm)  # randomly permute the original order
      self._images = self._images[perm]
      self._labels = self._labels[perm]
      # Start next epoch
      start = 0
      self._index_in_epoch = batch_size
      assert batch_size <= self._num_examples
    end = self._index_in_epoch
    return self._images[start:end], self._labels[start:end]
  • assert: if the number of images does not equal the number of labels, the assertion fails and the error message prints images.shape and labels.shape
  • multiply(a, b): element-wise multiplication of the arrays a and b
  • @property: a decorator that turns a getter method into an attribute access, which reads more cleanly
  • numpy.random.shuffle(perm): its effect is shown below
>>> perm = np.arange(10)
>>> np.random.shuffle(perm)
>>> perm
[1 7 5 2 9 4 3 6 0 8]
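A usage sketch for next_batch (here `train` stands for the DataSet that read_data_sets builds below, i.e. mnist.train):

batch_images, batch_labels = train.next_batch(50)
print(batch_images.shape)  # (50, 784): images are flattened and rescaled to [0.0, 1.0]
print(batch_labels.shape)  # (50, 10) when one_hot=True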
def read_data_sets(train_dir, fake_data=False, one_hot=False):
  class DataSets(object):
    pass  # in Python, pass is a placeholder (no-op) statement
  data_sets = DataSets()
  if fake_data:
    data_sets.train = DataSet([], [], fake_data=True)
    data_sets.validation = DataSet([], [], fake_data=True)
    data_sets.test = DataSet([], [], fake_data=True)
    return data_sets
  TRAIN_IMAGES = 'train-images-idx3-ubyte.gz'
  TRAIN_LABELS = 'train-labels-idx1-ubyte.gz'
  TEST_IMAGES = 't10k-images-idx3-ubyte.gz'
  TEST_LABELS = 't10k-labels-idx1-ubyte.gz'
  VALIDATION_SIZE = 5000
  local_file = maybe_download(TRAIN_IMAGES, train_dir)
  train_images = extract_images(local_file)
  local_file = maybe_download(TRAIN_LABELS, train_dir)
  train_labels = extract_labels(local_file, one_hot=one_hot)
  local_file = maybe_download(TEST_IMAGES, train_dir)
  test_images = extract_images(local_file)
  local_file = maybe_download(TEST_LABELS, train_dir)
  test_labels = extract_labels(local_file, one_hot=one_hot)
  validation_images = train_images[:VALIDATION_SIZE]
  validation_labels = train_labels[:VALIDATION_SIZE]
  train_images = train_images[VALIDATION_SIZE:]
  train_labels = train_labels[VALIDATION_SIZE:]
  data_sets.train = DataSet(train_images, train_labels)
  data_sets.validation = DataSet(validation_images, validation_labels)
  data_sets.test = DataSet(test_images, test_labels)
  return data_sets
  • pass: a function or class body cannot be empty, or Python raises an error; pass is a no-op statement whose only job is to hold the place and keep the code syntactically complete


Part 2: The convolutional neural network (CNN)

mnist = read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])                      
y_= tf.placeholder(tf.float32, shape=[None, 10]) 
  • Load the dataset and define the input placeholders x (images) and y_ (labels)
# Helper to initialize the weights W
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

# Helper to initialize the biases b
def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

# Helper to build a convolution layer
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# Helper to build a pooling layer
def max_pool(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
  • tf.truncated_normal(shape, mean, stddev): draws from a truncated normal distribution; any sample that falls more than two standard deviations from the mean is redrawn. shape is the shape of the generated tensor, mean its mean, and stddev its standard deviation.
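Before wiring up the layers, a quick shape check of these helpers (a sketch; x_demo and w_demo are throwaway names): with 'SAME' padding and stride 1 the convolution preserves the 28x28 spatial size, and each 2x2 max-pool with stride 2 halves it, which is where the 7*7*64 below comes from (28 → 14 → 7).

x_demo = tf.placeholder(tf.float32, [None, 28, 28, 1])
w_demo = weight_variable([5, 5, 1, 32])
print(conv2d(x_demo, w_demo).shape)            # (?, 28, 28, 32)
print(max_pool(conv2d(x_demo, w_demo)).shape)  # (?, 14, 14, 32)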
x_image = tf.reshape(x, [-1,28,28,1])         # reshape the input so it fits the network
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)     # first convolution layer
h_pool1 = max_pool(h_conv1)                                  # first pooling layer
  • x_image = tf.reshape(x, [-1,28,28,1]): the -1 tells reshape to infer that dimension (here the batch size) from the total number of elements
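A small illustration of the inferred dimension (a NumPy sketch; reshape behaves the same way there):

import numpy as np

a = np.zeros((50, 784))
print(a.reshape(-1, 28, 28, 1).shape)  # (50, 28, 28, 1): the -1 is inferred as 50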
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)      # second convolution layer
h_pool2 = max_pool(h_conv2)                                   # second pooling layer
  • The second stage takes the first pooling layer's output as input; after convolution and another 2x2 pooling, the feature map is 7x7x64
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])              # flatten into a vector
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)    # first fully connected layer
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)                  # dropout layer
  • tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None):
  • TensorFlow's function for preventing or reducing overfitting, typically applied to fully connected layers
  • x is the input tensor
  • keep_prob is the probability that each element is kept
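A sketch of the behavior: at keep_prob=0.5 roughly half the activations are zeroed, and the survivors are scaled by 1/keep_prob so the expected sum stays the same.

with tf.Session() as demo_sess:  # throwaway session just for this demo
    print(demo_sess.run(tf.nn.dropout(tf.ones([2, 4]), keep_prob=0.5)))
# e.g. [[2. 0. 2. 2.]
#       [0. 2. 0. 2.]] -- each kept entry becomes 1/0.5 = 2.0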

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_predict = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)   # softmax layer

cross_entropy = -tf.reduce_sum(y_*tf.log(y_predict))     # cross-entropy
train_step = tf.train.GradientDescentOptimizer(1e-3).minimize(cross_entropy)    # gradient descent
correct_prediction = tf.equal(tf.argmax(y_predict,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
  • Optimize with gradient descent; alternatives include Momentum, AdaGrad, RMSProp, and Adam
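For example, swapping in Adam (an aside, not what this article trains with; the 1e-4 learning rate is a typical choice, not tuned here) is a one-line change:

train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)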

for i in range(30000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:                  # evaluate once every 100 training steps
    train_acc = accuracy.eval(feed_dict={x:batch[0], y_: batch[1], keep_prob: 1.0})
    print('step',i,'training accuracy',train_acc)
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})  # keep_prob 0.5 while training

test_acc=accuracy.eval(feed_dict={x:mnist.test.images[:2000], y_:mnist.test.labels[:2000], keep_prob: 1.0})
print("test accuracy",(test_acc))
  • Read the data from MNIST and train the model
  • The tail of the run looks like this:
step 29500 training accuracy 0.98
step 29600 training accuracy 0.96
step 29700 training accuracy 1.0
step 29800 training accuracy 0.96
step 29900 training accuracy 0.94
test accuracy 0.944


A small experiment:

def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.15)  # was 0.1
  return tf.Variable(initial)
  
for i in range(20000):
test_acc=accuracy.eval(feed_dict={x:mnist.test.images[:1000], y_:mnist.test.labels[:1000], keep_prob: 1.0})

  • Nudge the hyperparameters slightly: change stddev to 0.15, the number of training iterations to 20000, and evaluate on the first 1000 test images. Running the program now gives:
step 19000 training accuracy 0.1
step 19100 training accuracy 0.08
step 19200 training accuracy 0.14
step 19300 training accuracy 0.08
step 19400 training accuracy 0.06
step 19500 training accuracy 0.1
step 19600 training accuracy 0.12
step 19700 training accuracy 0.12
step 19800 training accuracy 0.08
step 19900 training accuracy 0.06
test accuracy 0.085
  • Training no longer converges: accuracy hovers around random guessing (about 0.1). Hyperparameter tuning is largely a matter of experience; even a small change to the initialization scale can destabilize training with this loss.
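One common stabilizer (an aside, not part of the original experiment): -tf.reduce_sum(y_*tf.log(y_predict)) produces NaN as soon as the softmax saturates and log(0) appears, which a larger initialization scale makes far more likely. Computing the loss from the logits with TensorFlow's fused op sidesteps this:

logits = tf.matmul(h_fc1_drop, W_fc2) + b_fc2  # pre-softmax scores
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))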


Source code:

# Copyright 2015 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Functions for downloading and reading MNIST data."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf 
import gzip
import os
import numpy
from six.moves import urllib
from six.moves import xrange  # pylint: disable=redefined-builtin
SOURCE_URL = 'http://yann.lecun.com/exdb/mnist/'
def maybe_download(filename, work_directory):
  """Download the data from Yann's website, unless it's already here."""
  if not os.path.exists(work_directory):
    os.mkdir(work_directory)
  filepath = os.path.join(work_directory, filename)
  if not os.path.exists(filepath):
    filepath, _ = urllib.request.urlretrieve(SOURCE_URL + filename, filepath)
    statinfo = os.stat(filepath)
    print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
  return filepath
 
def _read32(bytestream):
  # read four bytes as one big-endian unsigned integer
  dt = numpy.dtype(numpy.uint32).newbyteorder('>')
  return numpy.frombuffer(bytestream.read(4), dtype=dt)[0]
# Extract the images into a 4-D uint8 array
def extract_images(filename):
  """Extract the images into a 4D uint8 numpy array [index, y, x, depth]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2051:
      raise ValueError(
          'Invalid magic number %d in MNIST image file: %s' %
          (magic, filename))
    num_images = _read32(bytestream)
    rows = _read32(bytestream)
    cols = _read32(bytestream)
    buf = bytestream.read(rows * cols * num_images)
    data = numpy.frombuffer(buf, dtype=numpy.uint8)
    data = data.reshape(num_images, rows, cols, 1)
    return data
# Convert a dense label vector into a sparse label matrix
# e.g. if row i of the original vector is 3, then row i of the sparse matrix has a 1 at index 3 and 0 elsewhere
def dense_to_one_hot(labels_dense, num_classes=10):
  """Convert class labels from scalars to one-hot vectors."""
  # ndarray.shape gives the array's dimensions; shape[0] is the number of rows
  num_labels = labels_dense.shape[0]
  # numpy.arange(start, stop, step) creates an arithmetic sequence from start to stop with the given step
  index_offset = numpy.arange(num_labels) * num_classes
  labels_one_hot = numpy.zeros((num_labels, num_classes))
  # numpy.ndarray.flat views the array as 1-D; numpy.ravel() returns a flattened view
  labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
  return labels_one_hot
def extract_labels(filename, one_hot=False):
  """Extract the labels into a 1D uint8 numpy array [index]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2049:
      raise ValueError(
          'Invalid magic number %d in MNIST label file: %s' %
          (magic, filename))
    num_items = _read32(bytestream)
    buf = bytestream.read(num_items)
    labels = numpy.frombuffer(buf, dtype=numpy.uint8)
    if one_hot:
      return dense_to_one_hot(labels)
    return labels
class DataSet(object):
  def __init__(self, images, labels, fake_data=False):
    if fake_data:
      self._num_examples = 10000
    else:
      assert images.shape[0] == labels.shape[0], (
          "images.shape: %s labels.shape: %s" % (images.shape,
                                                 labels.shape))
      self._num_examples = images.shape[0]
      # Convert shape from [num examples, rows, columns, depth]
      # to [num examples, rows*columns] (assuming depth == 1)
      assert images.shape[3] == 1
      images = images.reshape(images.shape[0],
                              images.shape[1] * images.shape[2])
      # Convert from [0, 255] -> [0.0, 1.0].
      images = images.astype(numpy.float32)
      images = numpy.multiply(images, 1.0 / 255.0)
    self._images = images
    self._labels = labels
    self._epochs_completed = 0
    self._index_in_epoch = 0
    
  @property
  def images(self):
    return self._images
  @property
  def labels(self):
    return self._labels
  @property
  def num_examples(self):
    return self._num_examples
  @property
  def epochs_completed(self):
    return self._epochs_completed
  def next_batch(self, batch_size, fake_data=False):
    """Return the next `batch_size` examples from this data set."""
    if fake_data:
      fake_image = [1.0 for _ in xrange(784)]
      fake_label = 0
      return [fake_image for _ in xrange(batch_size)], [
          fake_label for _ in xrange(batch_size)]
    start = self._index_in_epoch
    self._index_in_epoch += batch_size
    if self._index_in_epoch > self._num_examples:
      # Finished epoch
      self._epochs_completed += 1
      # Shuffle the data
      perm = numpy.arange(self._num_examples)
      numpy.random.shuffle(perm)
      self._images = self._images[perm]
      self._labels = self._labels[perm]
      # Start next epoch
      start = 0
      self._index_in_epoch = batch_size
      assert batch_size <= self._num_examples
    end = self._index_in_epoch
    return self._images[start:end], self._labels[start:end]
def read_data_sets(train_dir, fake_data=False, one_hot=False):
  class DataSets(object):
    pass  # placeholder statement
  data_sets = DataSets()
  if fake_data:
    data_sets.train = DataSet([], [], fake_data=True)
    data_sets.validation = DataSet([], [], fake_data=True)
    data_sets.test = DataSet([], [], fake_data=True)
    return data_sets
  TRAIN_IMAGES = 'train-images-idx3-ubyte.gz'
  TRAIN_LABELS = 'train-labels-idx1-ubyte.gz'
  TEST_IMAGES = 't10k-images-idx3-ubyte.gz'
  TEST_LABELS = 't10k-labels-idx1-ubyte.gz'
  VALIDATION_SIZE = 5000
  local_file = maybe_download(TRAIN_IMAGES, train_dir)
  train_images = extract_images(local_file)
  local_file = maybe_download(TRAIN_LABELS, train_dir)
  train_labels = extract_labels(local_file, one_hot=one_hot)
  local_file = maybe_download(TEST_IMAGES, train_dir)
  test_images = extract_images(local_file)
  local_file = maybe_download(TEST_LABELS, train_dir)
  test_labels = extract_labels(local_file, one_hot=one_hot)
  validation_images = train_images[:VALIDATION_SIZE]
  validation_labels = train_labels[:VALIDATION_SIZE]
  train_images = train_images[VALIDATION_SIZE:]
  train_labels = train_labels[VALIDATION_SIZE:]
  data_sets.train = DataSet(train_images, train_labels)
  data_sets.validation = DataSet(validation_images, validation_labels)
  data_sets.test = DataSet(test_images, test_labels)
  return data_sets



#------------------------------------------------------------------------------------------------------

mnist = read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])                  # placeholder for the input images
y_= tf.placeholder(tf.float32, shape=[None, 10])             # placeholder for the input labels

# Helper to initialize the weights W
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

# Helper to initialize the biases b
def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

# Helper to build a convolution layer
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

# Helper to build a pooling layer
def max_pool(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

# Build the network
x_image = tf.reshape(x, [-1,28,28,1])         # reshape the input so it fits the network
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)     # first convolution layer
h_pool1 = max_pool(h_conv1)                                  # first pooling layer

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)      # second convolution layer
h_pool2 = max_pool(h_conv2)                                   # second pooling layer

W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])              # flatten into a vector
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)    # first fully connected layer

keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)                  # dropout layer

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_predict = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)   # softmax layer

cross_entropy = -tf.reduce_sum(y_*tf.log(y_predict))     # cross-entropy
train_step = tf.train.GradientDescentOptimizer(1e-3).minimize(cross_entropy)    # gradient descent
correct_prediction = tf.equal(tf.argmax(y_predict,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))                 # accuracy computation

sess=tf.InteractiveSession()                          
sess.run(tf.global_variables_initializer())

for i in range(30000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:                  # evaluate once every 100 training steps
    train_acc = accuracy.eval(feed_dict={x:batch[0], y_: batch[1], keep_prob: 1.0})
    print('step',i,'training accuracy',train_acc)
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})  # keep_prob 0.5 while training



test_acc=accuracy.eval(feed_dict={x:mnist.test.images[:2000], y_:mnist.test.labels[:2000], keep_prob: 1.0})
print("test accuracy",(test_acc))


References:

Byte order: https://blog.csdn.net/u011334621/article/details/52540543

numpy.dtype.newbyteorder: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dtype.newbyteorder.html

numpy.frombuffer: https://docs.scipy.org/doc/numpy/reference/generated/numpy.frombuffer.html
