OpenCV和SVM分类器在自动驾驶中的车辆检测

原创

刘盼

发布于 2018-03-01 11:20:29

2K5

发布于 2018-03-01 11:20:29

文章被收录于专栏：人人都是极客人人都是极客

这次文章的车辆检测在车辆感知模块中是非常重要的功能，本节课我们的目标如下：

在标记的图像训练集上进行面向梯度的直方图（HOG）特征提取并训练分类器线性SVM分类器
应用颜色转换，并将分箱的颜色特征以及颜色的直方图添加到HOG特征矢量中
对于上面两个步骤，不要忘记标准化您的功能，并随机选择一个用于训练和测试的选项
实施滑动窗口技术，并使用您训练的分类器搜索图像中的车辆
在视频流上运行流水线（从test_video.mp4开始，稍后在完整的project_video.mp4中实现），并逐帧创建循环检测的热图，以拒绝异常值并跟踪检测到的车辆
估算检测到的车辆的边界框

定向梯度直方图（HOG）

定向梯度直方图（HOG）是计算机视觉和图像处理中用于目标检测的特征描述符。该技术计算图像的局部部分中梯度定向的发生。这种方法类似于边缘方向直方图，尺度不变特征变换描述符和形状上下文，但不同之处在于它是在均匀间隔的单元的密集网格上计算的，并使用重叠的局部对比度归一化来提高准确性。

从训练图像中提取HOG特征

此步骤的代码包含在方法“get_hog_features”中的文件vehicle_detection.py中

# Define a function to return HOG features and visualizationdef get_hog_features(self, img, orient, pix_per_cell, cell_per_block, 
                    vis=False, feature_vec=True):# Call with two outputs if vis==Trueif vis == True:
    features, hog_image = hog(img, orientations=orient, 
                              pixels_per_cell=(pix_per_cell, pix_per_cell),
                              cells_per_block=(cell_per_block, cell_per_block), 
                              transform_sqrt=True, 
                              visualise=vis, feature_vector=feature_vec)    return features, hog_image# Otherwise call with one outputelse:      
    features = hog(img, orientations=orient, 
                   pixels_per_cell=(pix_per_cell, pix_per_cell),
                   cells_per_block=(cell_per_block, cell_per_block), 
                   transform_sqrt=True, 
                   visualise=vis, feature_vector=feature_vec)    return features

我开始阅读所有的车辆和非车辆图像。这里是每一个中的一个的一个例子vehicle和non-vehicle类：

然后我探索不同的色彩空间和不同的skimage.hog()参数（orientations，pixels_per_cell，和cells_per_block）。我从两个类的每一个中抓取随机图像，并显示它们，以感受skimage.hog()输出的样子。使用后的图像以不同的对比度和光度多次试验，它的工作最好使用YCrCb相结合的色彩空间与HOG提取的特征orientations=9，pixels_per_cell=(8, 8)和cells_per_block=(2, 2)。

这是一个使用YCrCb色彩空间和HOG参数的例子orientations=9，pixels_per_cell=(8, 8)并且cells_per_block=(2, 2)：

提取颜色特征的空间分级

为了使算法在识别汽车时更加稳健，HOG特征还增加了一种新的特征。除非你确切地知道你的目标对象是什么样子，否则模板匹配不是一个特别可靠的寻找车辆的方法。但是，原始像素值在搜索汽车中包含在您的特征向量中仍然非常有用。

虽然包含全分辨率图像的三个颜色通道可能很麻烦，但是我们可以对图像执行空间分级，并且仍然保留足够的信息来帮助查找车辆。

正如你在下面的例子中看到的那样，即使一路下降到32×32像素分辨率，汽车本身仍然可以被眼睛清楚地识别，这意味着相关特征仍然保留在这个分辨率下。

OpenCV的cv2.resize（）是一个方便的缩小图像分辨率的函数。

# Define a function to compute binned color features  
    def bin_spatial(self, img, size=(32, 32)):
        # Use cv2.resize().ravel() to create the feature vector
        #features = cv2.resize(img, size).ravel() 
        # Return the feature vector
        #return features
        color1 = cv2.resize(img[:,:,0], size).ravel()
        color2 = cv2.resize(img[:,:,1], size).ravel()
        color3 = cv2.resize(img[:,:,2], size).ravel()        return np.hstack((color1, color2, color3))

提取颜色特征的直方图

在这个项目中使用的另一个技术，使更多的功能是颜色强度的直方图，如下图所示。

并执行如下所示：

# Define a function to compute color histogram features 
    # NEED TO CHANGE bins_range if reading .png files with mpimg!
    def color_hist(self, img, nbins=32, bins_range=(0, 256)):
        # Compute the histogram of the color channels separately
        channel1_hist = np.histogram(img[:,:,0], bins=nbins, range=bins_range)
        channel2_hist = np.histogram(img[:,:,1], bins=nbins, range=bins_range)
        channel3_hist = np.histogram(img[:,:,2], bins=nbins, range=bins_range)        # Concatenate the histograms into a single feature vector
        hist_features = np.concatenate((channel1_hist[0], channel2_hist[0], channel3_hist[0]))        # Return the individual histograms, bin_centers and feature vector
        return hist_features

合并和规范化的功能

现在我们的工具箱中已经有了几个特征提取方法，我们几乎已经准备好对分类器进行训练了，但是首先，就像在任何机器学习应用程序中一样，我们需要规范化数据。Python的sklearn包为您提供了StandardScaler（）方法来完成这个任务。要详细了解如何使用StandardScaler（）方法选择不同的标准化，请后台留言查阅文档。

将单个图像的所有不同特征组合为一组特征：

def convert_color(self, image, color_space='RGB'):
        if color_space == 'HSV':
            image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)        elif color_space == 'LUV':
            image = cv2.cvtColor(image, cv2.COLOR_RGB2LUV)        elif color_space == 'HLS':
            image = cv2.cvtColor(image, cv2.COLOR_RGB2HLS)        elif color_space == 'YUV':
            image = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)        elif color_space == 'YCrCb':
            image = cv2.cvtColor(image, cv2.COLOR_RGB2YCrCb)        return image    # Define a function to extract features from a list of images
    # Have this function call bin_spatial() and color_hist()
    def extract_features(self, image, color_space='RGB', spatial_size=(32, 32),
                            hist_bins=32, orient=9, 
                            pix_per_cell=8, cell_per_block=2, hog_channel=0,
                            spatial_feat=True, hist_feat=True, hog_feat=True):
        file_features = []        # apply color conversion if other than 'RGB'
        if color_space != 'RGB':
            feature_image = self.convert_color(image, color_space)        else: feature_image = np.copy(image)      

        if spatial_feat == True:
            spatial_features = self.bin_spatial(feature_image, size=spatial_size)
            file_features.append(spatial_features)        if hist_feat == True:            # Apply color_hist()
            hist_features = self.color_hist(feature_image, nbins=hist_bins)
            file_features.append(hist_features)        if hog_feat == True:        # Call get_hog_features() with vis=False, feature_vec=True
            if hog_channel == 'ALL':
                hog_features = []                for channel in range(feature_image.shape[2]):
                    hog_features.append(self.get_hog_features(feature_image[:,:,channel], 
                                        orient, pix_per_cell, cell_per_block, 
                                        vis=False, feature_vec=True))
                hog_features = np.ravel(hog_features)        
            else:
                hog_features = self.get_hog_features(feature_image[:,:,hog_channel], orient, 
                            pix_per_cell, cell_per_block, vis=False, feature_vec=True)            # Append the new feature vector to the features list
            file_features.append(hog_features)        return file_features

规范化是需要避免一些功能类型更重要的其他：

# Extract features of from all not-car images
        notcar_features = []        for file in notcar_filenames:
            # Read in each one by one
            image = mpimg.imread(file)
            features = self.extract_features(image, color_space=self.color_space, 
                                spatial_size=self.spatial_size, hist_bins=self.hist_bins, 
                                orient=self.orient, pix_per_cell=self.pix_per_cell, 
                                cell_per_block=self.cell_per_block, 
                                hog_channel=self.hog_channel, spatial_feat=self.spatial_feat, 
                                hist_feat=self.hist_feat, hog_feat=self.hog_feat)
            notcar_features.append(np.concatenate(features))

        X = np.vstack((car_features, notcar_features)).astype(np.float64)                        

        # Fit a per-column scaler        self.X_scaler = StandardScaler().fit(X)
        # Apply the scaler to X
        scaled_X = self.X_scaler.transform(X)

使用规范化的功能来训练分类器

我使用2类图像车辆和非车辆图像训练了一个线性支持向量机。首先加载图像，然后提取归一化的特征，并在2个数据集中训练（80％）和测试（20％）中的混洗和分裂。在使用StandardScaler（）训练分类器之前，将特征缩放到零均值和单位方差。源代码可以在vechicle_detection.py中找到

def __train(self):
        print ('Training the model ...')        # Read in and make a list of calibration images
        car_filenames = glob.glob(self.__train_directory+'/vehicles/*/*')  
        notcar_filenames = glob.glob(self.__train_directory+'/non-vehicles/*/*')               

        # Extract features of from all car images
        car_features = []        for file in car_filenames:
            # Read in each one by one
            image = mpimg.imread(file)
            features = self.extract_features(image, color_space=self.color_space, 
                                spatial_size=self.spatial_size, hist_bins=self.hist_bins, 
                                orient=self.orient, pix_per_cell=self.pix_per_cell, 
                                cell_per_block=self.cell_per_block, 
                                hog_channel=self.hog_channel, spatial_feat=self.spatial_feat, 
                                hist_feat=self.hist_feat, hog_feat=self.hog_feat)
            car_features.append(np.concatenate(features))        # Extract features of from all not-car images
        notcar_features = []        for file in notcar_filenames:
            # Read in each one by one
            image = mpimg.imread(file)
            features = self.extract_features(image, color_space=self.color_space, 
                                spatial_size=self.spatial_size, hist_bins=self.hist_bins, 
                                orient=self.orient, pix_per_cell=self.pix_per_cell, 
                                cell_per_block=self.cell_per_block, 
                                hog_channel=self.hog_channel, spatial_feat=self.spatial_feat, 
                                hist_feat=self.hist_feat, hog_feat=self.hog_feat)
            notcar_features.append(np.concatenate(features))        X = np.vstack((car_features, notcar_features)).astype(np.float64)                        

        # Fit a per-column scaler
        self.X_scaler = StandardScaler().fit(X)        # Apply the scaler to X
        scaled_X = self.X_scaler.transform(X)        # Define the labels vector
        y = np.hstack((np.ones(len(car_features)), np.zeros(len(notcar_features))))        # Split up data into randomized training and test sets
        rand_state = np.random.randint(0, 100)        X_train, X_test, y_train, y_test = train_test_split(scaled_X, y, test_size=0.2, random_state=rand_state)

        print('Using:',self.orient,'orientations',self.pix_per_cell,            'pixels per cell and', self.cell_per_block,'cells per block')
        print('Feature vector length:', len(X_train[0]))        # Use a linear SVC 
        self.svc = LinearSVC()        self.svc.fit(X_train, y_train)        # Check the score of the SVC
        print('Test Accuracy of SVC = ', round(self.svc.score(X_test, y_test), 4))        # Pickle to save time for subsequent runs
        binary = {}
        binary["svc"] = self.svc
        binary["X_scaler"] = self.X_scaler
        pickle.dump(binary, open(self.__train_directory + '/' + self.__binary_filename, "wb"))    def __load_binary(self):
        '''Load previously computed trained classifier'''
        with open(self.__train_directory + '/' + self.__binary_filename, mode='rb') as f:
            binary = pickle.load(f)        self.svc = binary['svc']        self.X_scaler = binary['X_scaler']    def get_data(self):
        '''Getter for the trained data. At the first call it gerenates it.'''
        if os.path.isfile(self.__train_directory + '/' + self.__binary_filename):
            self.__load_binary()        else:
            self.__train()        return self.svc, self.X_scaler

整个数据集（列车+测试）在车辆和非车辆之间均匀分布有17.767个项目。训练完成后，train.p被保存在子文件夹列中的磁盘上，供以后重新使用。训练好的线性支持向量机分类器在测试数据集上的准确性相当高〜0.989

滑动窗口搜索

我决定使用重叠的滑动窗口搜索来搜索图像下部的车辆。只需要搜索下面的部分，以避免搜索天空中的车辆，并使算法更快。窗口大小为64像素，每个单元8个单元和8个像素。在每张幻灯片窗户移动2个单元向右或向下。为了避免每个窗口反复提取特征，搜索速度更快，特征提取只进行一次，滑动窗口只使用该部分图像。如果窗户在长短途容纳所有车辆时具有不同的比例尺，则检测也可以更加稳健。

该实现可以在vehicle_detection.py中找到：

# Define a single function that can extract features using hog sub-sampling and make predictions
    def find_cars(self, img, plot=False):

        bbox_list = [] 

        draw_img = np.copy(img)
        img = img.astype(np.float32)/255
        img_tosearch = img[self.ystart:self.ystop,:,:]
        ctrans_tosearch = self.convert_color(img_tosearch, color_space='YCrCb')        if self.scale != 1:
            imshape = ctrans_tosearch.shape
            ctrans_tosearch = cv2.resize(ctrans_tosearch, (np.int(imshape[1]/self.scale), np.int(imshape[0]/self.scale)))

        ch1 = ctrans_tosearch[:,:,0]
        ch2 = ctrans_tosearch[:,:,1]
        ch3 = ctrans_tosearch[:,:,2]        # Define blocks and steps as above
        nxblocks = (ch1.shape[1] // self.pix_per_cell)-1
        nyblocks = (ch1.shape[0] // self.pix_per_cell)-1 
        nfeat_per_block = self.orient*self.cell_per_block**2
        # 64 was the orginal sampling rate, with 8 cells and 8 pix per cell
        window = 64
        nblocks_per_window = (window // self.pix_per_cell)-1 
        cells_per_step = 2  # Instead of overlap, define how many cells to step
        nxsteps = (nxblocks - nblocks_per_window) // cells_per_step
        nysteps = (nyblocks - nblocks_per_window) // cells_per_step

        # Compute individual channel HOG features for the entire image
        hog1 = self.get_hog_features(ch1, self.orient, self.pix_per_cell, self.cell_per_block, feature_vec=False)
        hog2 = self.get_hog_features(ch2, self.orient, self.pix_per_cell, self.cell_per_block, feature_vec=False)
        hog3 = self.get_hog_features(ch3, self.orient, self.pix_per_cell, self.cell_per_block, feature_vec=False)

        bbox_all_list = [] 
        for xb in range(nxsteps+1):            for yb in range(nysteps):
                ypos = yb*cells_per_step
                xpos = xb*cells_per_step                # Extract HOG for this patch
                hog_feat1 = hog1[ypos:ypos+nblocks_per_window, xpos:xpos+nblocks_per_window].ravel() 
                hog_feat2 = hog2[ypos:ypos+nblocks_per_window, xpos:xpos+nblocks_per_window].ravel() 
                hog_feat3 = hog3[ypos:ypos+nblocks_per_window, xpos:xpos+nblocks_per_window].ravel() 
                hog_features = np.concatenate((hog_feat1, hog_feat2, hog_feat3))

                xleft = xpos*self.pix_per_cell
                ytop = ypos*self.pix_per_cell                # Extract the image patch
                subimg = cv2.resize(ctrans_tosearch[ytop:ytop+window, xleft:xleft+window], (64,64))                # Get color features
                spatial_features = self.bin_spatial(subimg, size=self.spatial_size)
                hist_features = self.color_hist(subimg, nbins=self.hist_bins)                # Scale features and make a prediction
                test_features = self.X_scaler.transform(np.hstack((spatial_features, hist_features, hog_features)).reshape(1, -1))    
                test_prediction = self.svc.predict(test_features)                # compute current seize of the window
                xbox_left = np.int(xleft*self.scale)
                ytop_draw = np.int(ytop*self.scale)
                win_draw = np.int(window*self.scale)
                bbox = ((xbox_left, ytop_draw+self.ystart),(xbox_left+win_draw,ytop_draw+win_draw+self.ystart))                if test_prediction == 1:
                    bbox_list.append(bbox)

                bbox_all_list.append(bbox)        if(plot==True):
            draw_img_detected = np.copy(draw_img)            # draw all all searched windows
            for bbox in bbox_all_list:
                cv2.rectangle(draw_img, bbox[0], bbox[1], (0,0,255), 3) 
            for bbox in bbox_list:
                cv2.rectangle(draw_img_detected, bbox[0], bbox[1], (0,0,255), 3) 

            fig = plt.figure()
            plt.subplot(121)
            plt.imshow(draw_img)
            plt.title('Searched sliding windows')
            plt.subplot(122)
            plt.imshow(draw_img_detected, cmap='hot')
            plt.title('Detected vechicle windows')
            fig.tight_layout()
            plt.show()        return bbox_list


    def draw_labeled_bboxes(self, img, labels):        # Iterate through all detected cars
        for car_number in range(1, labels[1]+1):            # Find pixels with each car_number label value
            nonzero = (labels[0] == car_number).nonzero()            # Identify x and y values of those pixels
            nonzeroy = np.array(nonzero[0])
            nonzerox = np.array(nonzero[1])            # Define a bounding box based on min/max x and y
            bbox = ((np.min(nonzerox), np.min(nonzeroy)), (np.max(nonzerox), np.max(nonzeroy)))            # Draw the box on the image
            cv2.rectangle(img, bbox[0], bbox[1], (0,0,255), 6)        # Return the image
        return img

从图中可以看出，2辆车正确检测到，但也有一些误报。

为了避免误报，使用热图。点击地图加窗，重叠的窗口有更高的价值。超过一定的阈值的值保持为真正的正值。

def add_heat(self, heatmap, bbox_list):
        # Iterate through list of bboxes
        for box in bbox_list:            # Add += 1 for all pixels inside each bbox
            # Assuming each "box" takes the form ((x1, y1), (x2, y2))
            heatmap[box[0][1]:box[1][1], box[0][0]:box[1][0]] += 1

        # Return updated heatmap
        return heatmap# Iterate through list of bboxes

    def apply_threshold(self, heatmap, threshold):
        # Zero out pixels below the threshold
        heatmap[heatmap <= threshold] = 0
        # Return thresholded map
        return heatmap
box_list = vehicle_detector.find_cars(image, plot=plot)heat = np.zeros_like(image[:,:,0]).astype(np.float)# Add heat to each box in box listheat = vehicle_detector.add_heat(heat, box_list)# Apply threshold to help remove false positivesheat = vehicle_detector.apply_threshold(heat,1)# Visualize the heatmap when displaying    heatmap = np.clip(heat, 0, 255)

要从热图找到最终的框，使用标签功能。

from scipy.ndimage.measurements import label
# Find final boxes from heatmap using label functionlabels = label(heatmap)if(plot==True):
    #print(labels[1], 'cars found')
    plt.imshow(labels[0], cmap='gray')
    plt.show()

管道处理一个图像

如下面的代码所示，首先我们提取边界框，包括真和假的正面。然后使用热图我们丢弃误报。在使用该scipy.ndimage.measurements.label()方法计算最终的框之后。最后，这些框被渲染。

def process_image(image, plot=False):

    box_list = vehicle_detector.find_cars(image, plot=plot)

    heat = np.zeros_like(image[:,:,0]).astype(np.float)    # Add heat to each box in box list
    heat = vehicle_detector.add_heat(heat, box_list)    # Apply threshold to help remove false positives
    heat = vehicle_detector.apply_threshold(heat,1)    # Visualize the heatmap when displaying    
    heatmap = np.clip(heat, 0, 255)    # Find final boxes from heatmap using label function
    labels = label(heatmap)    if(plot==True):        #print(labels[1], 'cars found')
        plt.imshow(labels[0], cmap='gray')
        plt.show()

    new_image = vehicle_detector.draw_labeled_bboxes(image, labels)    if(plot==True):
        fig = plt.figure()
        plt.subplot(121)
        plt.imshow(new_image)
        plt.title('Car Positions')
        plt.subplot(122)
        plt.imshow(heatmap, cmap='hot')
        plt.title('Heat Map')
        fig.tight_layout()
        plt.show()    return new_imagedef process_test_images(vehicle_detector, plot=False):

    test_filenames = glob.glob(TEST_DIRECTORY+'/'+TEST_FILENAME) 
    # Process each test image
    for image_filename in test_filenames:        # Read in each image
        image = cv2.imread(image_filename)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB) # RGB is standard in matlibplot

        image = process_image(image, plot)

        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR) # RGB is standard in matlibplot
        cv2.imwrite(OUTPUT_DIRECTORY+'/'+image_filename.split('/')[-1], image)

这是测试图像之一的流水线的结果：

管道处理一个视频

process_image(image, plot=False)在视频处理中使用了用于处理一个图像的相同流水线。每个帧都从视频中提取，由图像管道处理，并使用VideoFileClip和ffmpeg合并到最终的视频中

from moviepy.editor import VideoFileClip
def process_video(video_filename, vehicle_detector, plot=False):
    video_input = VideoFileClip(video_filename + ".mp4")
    video_output = video_input.fl_image(process_image)
    video_output.write_videofile(video_filename + "_output.mp4", audio=False)process_test_images(vehicle_detector, plot=False)

后台留言回复可以查看视频结果及所有源代码。

结论

当前使用SVM分类器的实现对于测试的图像和视频来说工作良好，这主要是因为图像和视频被记录在类似的环境中。用一个非常不同的环境测试这个分类器不会有类似的好结果。使用深度学习和卷积神经网络的更健壮的分类器将更好地推广到未知数据。

当前实现的另一个问题是在视频处理流水线中不考虑后续帧。保持连续帧之间的热图将更好地丢弃误报。

目前的实施还有一个更大的改进是多尺寸滑动窗口，这将更好地概括查找短距离和长距离的车辆。

原创声明：本文系作者授权腾讯云开发者社区发表，未经许可，不得转载。

如有侵权，请联系 cloudcommunity@tencent.com 删除。

其他

原创声明：本文系作者授权腾讯云开发者社区发表，未经许可，不得转载。

如有侵权，请联系 cloudcommunity@tencent.com 删除。

其他

登录后参与评论

0 条评论

热度