
# Quickly Reading the 70,000-Image MNIST Dataset with TensorFlow

## Overview

.jpeg: height, width, channels;

.png: height, width, channels, alpha.

(Note: images stored as .png carry an extra alpha (transparency) channel, which can be discarded when processing the pixels.)
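As a small illustration of the two layouts (using a NumPy array as a stand-in for a decoded image file; the shapes, not the pixel values, are the point):

```python
import numpy as np

# A toy 2x2 RGBA image, shaped like a decoded .png: (height, width, channels + alpha).
png_pixels = np.zeros((2, 2, 4), dtype=np.uint8)

# Dropping the alpha channel leaves the .jpeg-style (height, width, channels) layout.
rgb_pixels = png_pixels[..., :3]

print(png_pixels.shape)  # (2, 2, 4)
print(rgb_pixels.shape)  # (2, 2, 3)
```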

## Binary Data

Reasons for using binary data

The approach to loading images

```python
import gzip, os
import numpy as np

location = input('The directory of MNIST dataset: ')
path = os.path.join(location, 'train-images-idx3-ubyte.gz')

try:
    with gzip.open(path, 'rb') as fi:
        # Skip the 16-byte header; the rest is raw uint8 pixel data.
        data_i = np.frombuffer(fi.read(), dtype=np.uint8, offset=16)
    # Each 28x28 image becomes one flat row of 784 pixels.
    images_flat_all = data_i.reshape(-1, 784)
    print(images_flat_all)
    print('----- Separation -----')
    print('Size of images_flat: ', len(images_flat_all))
except FileNotFoundError:
    print("The file directory doesn't exist!")
```

```
###----- Result is shown below ----- ###
The directory of MNIST dataset: /home/abc/MNIST_data
[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
----- Separation -----
Size of images_flat:  60000
```
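Each row of `images_flat_all` holds the 784 pixels of one digit; reshaping a row recovers the 28x28 image grid. A quick sketch using a synthetic row, so it does not depend on the dataset being present:

```python
import numpy as np

# Stand-in for one row of images_flat_all: 784 uint8 pixel values.
row = np.arange(784, dtype=np.uint8)

# Reshaping recovers the original 28x28 image grid (row-major order).
image = row.reshape(28, 28)
print(image.shape)  # (28, 28)
```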

```python
path_label = os.path.join(location, 'train-labels-idx1-ubyte.gz')

with gzip.open(path_label, 'rb') as fl:
    # Labels start right after an 8-byte header, one uint8 per image.
    data_l = np.frombuffer(fl.read(), dtype=np.uint8, offset=8)

print(data_l)
print('----- Separation -----')
print('Size of images_labels: ', len(data_l), type(data_l[0]))
```

```
###----- Result is shown below ----- ###
[5 0 4 ... 5 6 8]
----- Separation -----
Size of images_labels:  60000 <class 'numpy.uint8'>
```

## Explanation of the code

In the raw MNIST files, the image data only begins at byte 16, while the label data begins at byte 8; that is why `offset` is set to 16 when reading images and to 8 when reading labels.
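Those offsets come from the IDX headers, which are big-endian 32-bit integers: four of them for image files (magic number, image count, rows, columns), two for label files. A minimal sketch with a synthetic header, using the format's documented values:

```python
import struct

# Build the 16-byte header of an MNIST image file: magic number,
# image count, row count, column count, all big-endian uint32.
header = struct.pack('>IIII', 2051, 60000, 28, 28)

magic, count, rows, cols = struct.unpack('>IIII', header)
print(magic, count, rows, cols)  # 2051 60000 28 28
print(len(header))               # 16 -> the image-data offset
```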

## Linear Model

```python
import numpy as np
import tensorflow as tf

# Synthetic data along the line y = 0.1 * x + 0.3.
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

weight = tf.Variable(tf.random_uniform(shape=[1], minval=-1.0, maxval=1.0))
bias = tf.Variable(tf.zeros(shape=[1]))
y = weight * x_data + bias

loss = tf.reduce_mean(tf.square(y - y_data))
# The optimizer definition was missing here; plain gradient descent is assumed.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.5)
training = optimizer.minimize(loss)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

for step in range(101):
    sess.run(training)
    if step % 10 == 0:
        print('Round {}, weight: {}, bias: {}'
              .format(step, sess.run(weight[0]), sess.run(bias[0])))
```

## MNIST in a Linear Model

batch size: the number of images trained per batch has to be tuned so that memory does not run out;

loss function: the loss function measures the gap between the prediction and the true answer.
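For the linear model above, that loss is the mean squared error; the same computation in plain NumPy, with made-up predictions and targets:

```python
import numpy as np

y_pred = np.array([0.32, 0.41, 0.38], dtype=np.float32)  # model outputs
y_true = np.array([0.30, 0.40, 0.40], dtype=np.float32)  # ground truth

# Mean of the squared prediction-target gaps, matching
# tf.reduce_mean(tf.square(y - y_data)) in the model above.
loss = np.mean(np.square(y_pred - y_true))
```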

```python
Accuracy()                    # Accuracy before doing anything
optimize(10);    Accuracy()   # Iterate 10 times
optimize(1000);  Accuracy()   # Iterate 10 + 1000 times
optimize(10000); Accuracy()   # Iterate 10 + 1000 + 10000 times
```

```
### ----- Results are shown below ----- ###
Accuracy on TestSet: 11.51%
Accuracy on TestSet: 68.37%
Accuracy on TestSet: 86.38%
Accuracy on TestSet: 89.34%
```
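The accuracy reported above is simply the fraction of test images whose predicted digit matches the label; a toy sketch with made-up predictions:

```python
import numpy as np

predictions = np.array([7, 2, 1, 0, 4])  # hypothetical predicted digits
labels      = np.array([7, 2, 1, 0, 5])  # true labels

# Fraction of matches; 4 of the 5 predictions are correct here.
accuracy = np.mean(predictions == labels)
print('Accuracy on TestSet: {:.2%}'.format(accuracy))  # 80.00%
```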

Annotation No. 1: `tf.matmul(x_train, weights)`
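`tf.matmul(x_train, weights)` multiplies a batch of flattened images by the weight matrix; the shapes (a batch size of 32 is just an example) work out as follows:

```python
import numpy as np

x_train = np.zeros((32, 784), dtype=np.float32)  # batch of flattened 28x28 images
weights = np.zeros((784, 10), dtype=np.float32)  # one weight column per digit class

# Matrix product: [32, 784] x [784, 10] -> [32, 10] logits, one score per class.
logits = x_train @ weights
print(logits.shape)  # (32, 10)
```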

Reason for using `one_hot()`
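`one_hot()` turns a digit label into a length-10 vector with a single 1, so that the network's 10 outputs can be compared against it directly. An equivalent NumPy sketch:

```python
import numpy as np

labels = np.array([5, 0, 4])  # first three MNIST training labels

# Row k of the 10x10 identity matrix is the one-hot vector for digit k.
one_hot = np.eye(10, dtype=np.float32)[labels]
print(one_hot[0])  # 1.0 at index 5, zeros elsewhere
```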

## Finally

```python
wrong_predicted_images(pic_num=[3, 3], label_number=5)
sess.close()
```

Original CSDN article: https://blog.csdn.net/Kuo_Jun_Lin/article/details/82106711?utm_source=copy

