Traffic Sign Recognition for Self-Driving Cars

代码医生工作室

Published 2020-02-21 15:38:35

Included in the column: 相约机器人

Author | Aditya Mankar

Source | Medium

Editor | 代码医生团队

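Contents:

Motivation.
Understanding the dataset.
Step 0: Importing libraries and the dataset.
Step 1: Data preprocessing.
Step 2: Data visualization.
The intuition behind ConvNets.
Step 3: Training the model.
Step 4: Model evaluation.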

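Motivation:

Thanks to the efforts of companies such as Tesla to automate electric cars, self-driving cars are becoming very popular. To reach Level 5 autonomy, these cars must correctly recognize traffic signs and obey traffic rules. Having recognized a sign, the car should also be able to make the appropriate decision in response.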

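Understanding the dataset:

The German Traffic Sign Benchmark (GTSRB) was a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011. The dataset can be downloaded here:

https://www.kaggle.com/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign

The dataset has the following properties:

Single-image, multi-class classification problem
More than 40 classes
More than 50,000 images in total
Large, lifelike database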

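Step 0: Importing libraries and the dataset:

In this first step, we import the standard libraries and load the dataset, storing the images in data and their class IDs in labels. TensorFlow is imported for Keras, cv2 for computer-vision tasks, and PIL for handling different image file formats.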

# Importing standard libraries
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import tensorflow as tf
from PIL import Image
# Importing dataset
data = []     # image arrays
labels = []   # class ID of each image
classes = 43
cur_path = os.getcwd()
# Each class has its own folder: train/0, train/1, ..., train/42
for i in range(classes):
    path = os.path.join(cur_path, 'train', str(i))
    images = os.listdir(path)

    for a in images:
        try:
            # os.path.join works on every OS, unlike a hard-coded '\\'
            image = Image.open(os.path.join(path, a))
            # Resize so that every image has the same 30x30 shape
            image = image.resize((30, 30))
            image = np.array(image)
            data.append(image)
            labels.append(i)
        except Exception as e:
            print("Error loading image", a, ":", e)
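The loading loop relies on one folder per class under train/, with the folder name serving as the label. The sketch below illustrates that traversal on a tiny temporary directory tree, using only the standard library. The helper name list_labeled_files and the file names are hypothetical, chosen for illustration; they are not part of the original article.

```python
import os
import tempfile


def list_labeled_files(root, num_classes):
    """Return (file_path, class_id) pairs for root/train/<class_id>/ folders."""
    pairs = []
    for class_id in range(num_classes):
        class_dir = os.path.join(root, "train", str(class_id))
        for name in sorted(os.listdir(class_dir)):
            # os.path.join picks the correct separator on every OS
            pairs.append((os.path.join(class_dir, name), class_id))
    return pairs


# Build a tiny fake dataset tree: 2 classes, hypothetical empty files.
with tempfile.TemporaryDirectory() as root:
    for class_id, names in enumerate([["a.png", "b.png"], ["c.png"]]):
        class_dir = os.path.join(root, "train", str(class_id))
        os.makedirs(class_dir)
        for name in names:
            open(os.path.join(class_dir, name), "wb").close()
    pairs = list_labeled_files(root, num_classes=2)

labels_only = [label for _, label in pairs]
print(labels_only)  # [0, 0, 1]
```

Collecting paths first and opening images afterwards keeps the labeling logic easy to check separately from image decoding.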

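Step 1: Data preprocessing:

To process the data, we first convert it to NumPy arrays and check the dataset's dimensions with the shape attribute. Next, the train_test_split function divides the dataset into training and test sets in an 80:20 ratio. Y_train and Y_test hold the 43 classes as integers, a form the model's categorical cross-entropy loss cannot use directly, so the to_categorical function converts them to one-hot vectors.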
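As a quick sanity check on the 80:20 split, the arithmetic can be worked out by hand. This is a minimal sketch that assumes the test portion is rounded up to a whole image, which is consistent with the shapes reported in this step for the 39,209 loaded images; split_sizes is a hypothetical helper, not a scikit-learn function.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test), rounding the test count up (assumption)."""
    n_test = math.ceil(n_samples * test_size)
    n_train = n_samples - n_test
    return n_train, n_test


n_train, n_test = split_sizes(39209, 0.2)
print(n_train, n_test)  # 31367 7842
```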

# Converting to array
data = np.array(data)
labels = np.array(labels)
# Dataset Dimensions - (Number of Images, Height, Width, Color channels)
print("Dataset dimensions : ",data.shape)
output:   
Dataset dimensions :  (39209, 30, 30, 3)
# Splitting the dataset into train and test
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(data, labels, test_size = 0.2, random_state = 42)
# Checking dimensions - (Number of Images, Height, Width, Color channels)
print("X_train shape : ", X_train.shape)
print("X_test shape : ", X_test.shape)
print("Y_train shape : ", Y_train.shape)
print("Y_test shape : ", Y_test.shape)
output:
X_train shape :  (31367, 30, 30, 3)
X_test shape :  (7842, 30, 30, 3)
Y_train shape :  (31367,)
Y_test shape :  (7842,)
# Converting integer class labels to one-hot vectors
from keras.utils import to_categorical
Y_train_categorical = to_categorical(Y_train, 43)
Y_test_categorical = to_categorical(Y_test, 43)
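To make the one-hot conversion concrete, here is a minimal NumPy sketch of the same idea. It is an illustration of the encoding, not the Keras implementation; the one_hot helper is hypothetical.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class IDs into rows with a single 1 at the class index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


example = one_hot([0, 2, 42], num_classes=43)
print(example.shape)   # (3, 43)
print(example[1, :4])  # [0. 0. 1. 0.]
```

Each row sums to 1, and taking the argmax of a row recovers the original integer label.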

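Step 2: Data visualization:

The imshow function displays a specific image from the dataset. After the resizing in Step 0, each image is 30 px high and 30 px wide, with 3 color channels.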

# Visualizing Dataset Images
i = 100
plt.imshow(X_train[i])
print("Sign category :",Y_train[i])

An image from the dataset

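The intuition behind ConvNets

Convolutional neural networks (CNNs) are very popular in computer vision applications because they can detect and recognize a wide variety of objects in images.

A typical CNN architecture

In layman's terms, a CNN is essentially a fully connected neural network with convolution operations at the front. These convolution operations detect defined patterns in an image, loosely analogous to neurons in the occipital lobe of the human brain. A ConvNet is built from three types of layers, which are stacked to form the complete architecture:

Convolutional layer.
Pooling layer.
Fully connected layer.

Convolutional layer: The convolutional layer is the core of a ConvNet and performs most of the computationally heavy work. A kernel (or filter) encoding a particular pattern slides across the whole image to detect a specific type of feature. The result of this traversal is a two-dimensional array called a feature map. Every value in the feature map is passed through the ReLU function to introduce non-linearity.
Pooling layer: This layer reduces the amount of data, which cuts computation and processing time. There are two types of pooling: average pooling and max pooling. As the names suggest, max pooling returns the maximum value, and average pooling the mean, of the image region covered by the kernel.
Fully connected layer: The two-dimensional output of the previous step is flattened into a column vector. This vector is fed to a multilayer neural network, which learns over a series of epochs to classify images, using the Softmax function at its output.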
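The three layer types described above can be sketched in plain NumPy on a single-channel toy image. This is an illustrative sketch under simplifying assumptions (one channel, one hand-picked kernel, no bias, no learning), not how Keras implements these layers; all function names here are hypothetical.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


def relu(x):
    """Zero out negative values; this is what adds non-linearity."""
    return np.maximum(x, 0)


def max_pool(x, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h = x.shape[0] // size * size
    w = x.shape[1] // size * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


# Toy 6x6 image whose brightness rises by 1 per column.
image = np.arange(36, dtype=float).reshape(6, 6)
# Hand-picked 3x3 kernel that responds to left-to-right brightness increase.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = relu(conv2d_valid(image, kernel))  # shape (4, 4)
pooled = max_pool(feature_map)                   # shape (2, 2)
flat = pooled.flatten()                          # shape (4,)
probs = softmax(np.array([1.0, 2.0, 3.0]))       # example class scores

print(feature_map.shape, pooled.shape, flat.shape)
```

In a trained network the kernel values are learned rather than hand-picked, and each layer applies many kernels across all input channels.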

Step 3: Training the model

# Importing Keras Libraries
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
# Creating Neural network Architecture
# Initialize neural network
model = Sequential()
# Add 2 convolutional layers with 32 filters, a 5x5 window, and ReLU activation function
model.add(Conv2D(filters = 32, kernel_size = (5, 5), activation = 'relu', input_shape = X_train.shape[1:]))
model.add(Conv2D(filters = 32, kernel_size = (5, 5), activation = 'relu'))
# Add max pooling layer with a 2x2 window
model.add(MaxPool2D(pool_size = (2, 2)))
# Add dropout layer
model.add(Dropout(rate = 0.25))
# Add 2 convolutional layers with 64 filters, a 3x3 window, and ReLU activation function
model.add(Conv2D(filters = 64, kernel_size = (3, 3), activation = 'relu'))
model.add(Conv2D(filters = 64, kernel_size = (3, 3), activation = 'relu'))
# Add max pooling layer with a 2x2 window
model.add(MaxPool2D(pool_size = (2, 2)))
# Add dropout layer
model.add(Dropout(rate = 0.25))
# Add layer to flatten input
model.add(Flatten())
# Add fully connected layer of 256 units with a ReLU activation function
model.add(Dense(256, activation = 'relu'))
# Add dropout layer
model.add(Dropout(rate = 0.5))
# Add fully connected output layer of 43 units (one per class) with a Softmax activation function
model.add(Dense(43, activation = 'softmax'))
# Summarizing the model architecture
model.summary()
output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 32)        2432      
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 22, 22, 32)        25632     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 11, 11, 32)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 11, 11, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 9, 9, 64)          18496     
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 7, 7, 64)          36928     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 3, 3, 64)          0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 3, 3, 64)          0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 576)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 256)               147712    
_________________________________________________________________
dropout_3 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 43)                11051     
=================================================================
Total params: 242,251
Trainable params: 242,251
Non-trainable params: 0
_________________________________________________________________
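The output shapes and parameter counts in the summary above can be reproduced by hand. The sketch below assumes unpadded convolutions with stride 1 and 2x2 pooling that drops any leftover row or column, which matches the shapes shown; the helper names are hypothetical.

```python
def conv_out(size, kernel):
    """Output width/height of an unpadded, stride-1 convolution."""
    return size - kernel + 1


def conv_params(kernel, in_channels, out_channels):
    """Weights per filter (kernel * kernel * in_channels) plus one bias each."""
    return (kernel * kernel * in_channels + 1) * out_channels


def dense_params(in_units, out_units):
    """One weight per input per unit, plus one bias per unit."""
    return (in_units + 1) * out_units


size, channels = 30, 3   # input images are 30x30 with 3 color channels
params = []

for kernel, filters in [(5, 32), (5, 32)]:   # first conv block
    params.append(conv_params(kernel, channels, filters))
    size, channels = conv_out(size, kernel), filters
size //= 2                                   # max pooling: 22 -> 11

for kernel, filters in [(3, 64), (3, 64)]:   # second conv block
    params.append(conv_params(kernel, channels, filters))
    size, channels = conv_out(size, kernel), filters
size //= 2                                   # max pooling: 7 -> 3

flat = size * size * channels                # flatten: 3 * 3 * 64
params.append(dense_params(flat, 256))
params.append(dense_params(256, 43))
total = sum(params)

print(flat, total)  # 576 242251
```

Most of the parameters sit in the first dense layer, which is typical for small ConvNets that flatten before a wide hidden layer.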
# Compile neural network
model.compile(loss = "categorical_crossentropy", optimizer = "adam", metrics = ["accuracy"])
# Train neural network
history = model.fit(X_train, Y_train_categorical, batch_size = 32, epochs = 15, validation_data = (X_test, Y_test_categorical))
Output after 15 epochs:
Epoch 15/15
31367/31367 [==============================] - 98s 3ms/step - loss: 0.2169 - acc: 0.9485 - val_loss: 0.0835 - val_acc: 0.9787
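The categorical_crossentropy loss used when compiling the model penalizes low predicted probability for the true class. The following is a minimal NumPy sketch of that idea, averaged over a batch; the clipping constant eps is an assumption added for numerical safety, and the function is an illustration rather than the Keras implementation.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(true * log(predicted))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
    return float(np.mean(per_sample))


# Two samples, three classes, one-hot true labels.
y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
confident = np.array([[0.05, 0.90, 0.05],
                      [0.80, 0.10, 0.10]])
unsure = np.array([[0.4, 0.3, 0.3],
                   [0.3, 0.4, 0.3]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident < loss_unsure)  # True
```

Confident, correct predictions give a loss near zero, which is why the training loss falls as accuracy rises.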

Step 4: Model evaluation:

# Plotting graph - Epoch vs Accuracy
# Note: newer Keras versions name these keys 'accuracy' and 'val_accuracy'
plt.plot(history.history['acc'], label='training accuracy')
plt.plot(history.history['val_acc'], label='val accuracy')
plt.title('Accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.grid()
plt.legend()
plt.show()

Accuracy vs. epochs

# Plotting graph - Epoch vs Loss
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.title('Loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.grid()
plt.legend()
plt.show()

Loss vs. epochs

# Calculating Accuracy Score
from sklearn.metrics import accuracy_score
# Test.csv lists each test image's file path and true class ID
y_test = pd.read_csv('Test.csv')
labels = y_test["ClassId"].values
imgs = y_test["Path"].values

data = []

for img in imgs:
    image = Image.open(img)
    image = image.resize((30,30))
    data.append(np.array(image))

X_test = np.array(data)

# predict_classes was removed from newer Keras versions;
# taking the argmax of the predicted probabilities is equivalent
pred = np.argmax(model.predict(X_test), axis=1)

print("Accuracy Score : ",accuracy_score(labels, pred))
Output:
Accuracy Score :  0.9499604117181314
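The accuracy score is simply the fraction of images whose predicted class matches the true class. Here is a minimal NumPy sketch with made-up probabilities for four images and three classes; the numbers and helper names are hypothetical and only illustrate the computation.

```python
import numpy as np


def predicted_classes(probabilities):
    """Pick the class with the highest predicted probability per row."""
    return np.argmax(probabilities, axis=1)


def accuracy(true_labels, predicted):
    """Fraction of predictions that equal the true label."""
    return float(np.mean(np.asarray(true_labels) == np.asarray(predicted)))


# Made-up softmax outputs: 4 images, 3 classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.2, 0.6],
                  [0.5, 0.4, 0.1]])
true_labels = [1, 0, 2, 1]

pred_classes = predicted_classes(probs)   # [1, 0, 2, 0]
acc = accuracy(true_labels, pred_classes)
print(acc)  # 0.75
```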

GitHub repository:

https://github.com/Aditya-Mankar/Traffic-Sign-Recognition

Originally published: 2020-02-17
