[MATLAB Machine Learning] Image Recognition

Quantitative Investment and Machine Learning (WeChat official account)

Published 2018-01-29 13:07:00

This article is part of the column: Quantitative Investment and Machine Learning

1. Classification in the Presence of Missing Data

2. Handwriting Recognition Using Bagged Classification Trees

[Click "Read the original" to download the code and data]

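Object classification is an important task in many computer vision applications, including surveillance, automotive safety, and image retrieval. For example, in an automotive safety application, you may need to classify nearby objects as pedestrians or vehicles. Regardless of the type of object being classified, the basic procedure for creating an object classifier is: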

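Acquire a labeled data set with images of the desired object.
Partition the data set into a training set and a test set.
Train the classifier using features extracted from the training set.
Test the classifier using features extracted from the test set.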
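
The partitioning step can be sketched as follows. This is a minimal illustration rather than the code from the original post: it assumes the Statistics and Machine Learning Toolbox is installed, and the label vector is a hypothetical stand-in for a real labeled image set.

```matlab
% Hypothetical labels: 20 images for each of the digits 0-9.
labels = repelem((0:9)', 20);

% Hold out 25% of the images for testing, stratified by label.
rng(0);                                   % make the split repeatable
part = cvpartition(labels, 'HoldOut', 0.25);

trainIdx = training(part);                % logical index of training images
testIdx  = test(part);                    % logical index of test images

fprintf('%d training images, %d test images\n', nnz(trainIdx), nnz(testIdx));
```

The two logical index vectors can then be used to select rows of a feature matrix, so that no image is used for both training and testing.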

Digit Data Set

For training, synthetic images are created using the insertText function from the Computer Vision System Toolbox™. Each training image contains a digit surrounded by other digits, which mimics how digits are normally seen together. Using synthetic images is convenient, and it enables the creation of a variety of training samples without having to collect them manually. For testing, scans of handwritten digits are used to validate how well the classifier performs on data that is different from the synthetic training data. Although this is not the most representative data set, there is enough data to train and test a classifier and to show the feasibility of the approach.
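
A synthetic training sample of this kind could be generated roughly as below. This is a sketch under assumptions, not the generator used for the data set: the canvas size, font size, text positions, and variable names are all illustrative choices.

```matlab
% Blank white canvas; the 16-by-48 size is an assumption for this sketch.
canvas = 255 * ones(16, 48, 'uint8');

% Three random digits: the middle one is the target, the others are context.
rng(1);
digitValues = randi([0 9], 1, 3);
digitText   = arrayfun(@num2str, digitValues, 'UniformOutput', false);

% [x y] position of the upper-left corner of each text box.
positions = [1 1; 17 1; 33 1];

% insertText returns a truecolor image even for grayscale input.
rgb  = insertText(canvas, positions, digitText, ...
                  'FontSize', 10, 'BoxOpacity', 0, 'TextColor', 'black');
gray = rgb2gray(rgb);

% Crop a 16-by-16 window around the middle digit.
sample = gray(:, 17:32);
```

Varying the font, font size, and neighboring digits between calls is one way to obtain many different samples for the same target digit.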

trainingImages is a 200-by-10 cell array of training image file names; each column contains both the positive and negative training images for a digit. trainingLabels is a 200-by-10 matrix containing a label for each image in the trainingImages cell array. The labels are logical values indicating whether the image is a positive or a negative instance for a digit. testImages is a 12-by-10 cell array containing the file names of the handwritten digit images. There are 12 examples per digit.

Note that prior to training and testing a classifier the following pre-processing step is applied to images from this dataset:
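
The code for this step is not reproduced here. As a minimal sketch of one plausible form, assuming the step is a global Otsu threshold that binarizes each image (graythresh and imbinarize from the Image Processing Toolbox), applied to a hypothetical noisy test image:

```matlab
% Hypothetical 16-by-16 grayscale image: dark noisy background, bright stroke.
rng(2);
img = uint8(40 + 10 * randn(16, 16));
img(4:12, 7:9) = 220;

% Otsu's method picks a global threshold in the range [0, 1] ...
lvl = graythresh(img);

% ... and binarizing with it suppresses the low-level background noise.
bw = imbinarize(img, lvl);
```

In releases that predate imbinarize, im2bw(img, lvl) plays the same role.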

This pre-processing step removes noise artifacts introduced while collecting the image samples and helps provide better feature vectors for training the classifier. For example, the output of this pre-processing step on a couple of training and test images is shown next:

Using HOG Features

The data used to train the SVM classifier are HOG feature vectors extracted from the training images. Therefore, it is important to make sure the HOG feature vector encodes the right amount of information about the object. The extractHOGFeatures function returns a visualization output that can help form some intuition about just what the "right amount of information" means. By varying the HOG cell size parameter and visualizing the result, you can see the effect the cell size parameter has on the amount of shape information encoded in the feature vector:

The visualization shows that a cell size of [8 8] does not encode much shape information, while a cell size of [2 2] encodes a lot of shape information but increases the dimensionality of the HOG feature vector significantly. A good compromise is a 4-by-4 cell size. This size setting encodes enough spatial information to visually identify a digit shape while limiting the number of dimensions in the HOG feature vector, which helps speed up training. In practice, the HOG parameters should be varied with repeated classifier training and testing to identify the optimal parameter settings.
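
The trade-off between cell size and feature length can be checked with a short sketch. The test image below is hypothetical, and the block size, block overlap, and bin count are the documented defaults of extractHOGFeatures, which this sketch assumes are left unchanged.

```matlab
% Hypothetical 16-by-16 image with a single vertical stroke.
img = zeros(16, 16, 'uint8');
img(3:14, 7:10) = 255;

% Defaults of extractHOGFeatures that determine the feature length.
blockSize    = [2 2];
blockOverlap = [1 1];
numBins      = 9;

cellSizes = {[2 2], [4 4], [8 8]};
for k = 1:numel(cellSizes)
    cs = cellSizes{k};
    [features, vis] = extractHOGFeatures(img, 'CellSize', cs);

    % Feature length predicted from the image and cell geometry.
    blocksPerImage = floor((size(img) ./ cs - blockSize) ./ ...
                           (blockSize - blockOverlap) + 1);
    expectedLen = prod([blocksPerImage, blockSize, numBins]);

    fprintf('CellSize [%d %d]: %d features (expected %d)\n', ...
            cs, numel(features), expectedLen);
    % plot(vis) overlays the HOG visualization on the current axes.
end
```

For a 16-by-16 image the formula gives 1764, 324, and 36 features for cell sizes of [2 2], [4 4], and [8 8], which shows how quickly the dimensionality grows as the cells shrink.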

Train the Classifier

Digit classification is a multi-class classification problem, where you have to classify an object into one of ten possible digit classes. The SVM algorithm in the Statistics and Machine Learning Toolbox™, however, produces a binary classifier, which means that it can only classify an object into one of two classes. To use a binary SVM for digit classification, 10 such classifiers are required, each one trained for a specific digit. This is a common technique for solving multi-class classification problems with binary classifiers and is known as "one-versus-all" or "one-versus-rest" classification.
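
A minimal sketch of the one-versus-all scheme follows. It uses fitcsvm and predict from the Statistics and Machine Learning Toolbox, and the feature matrix is a hypothetical stand-in (Gaussian clusters) for real HOG feature vectors, so it illustrates the structure of the training loop rather than the original code.

```matlab
rng(3);
numClasses = 10;  perClass = 20;  dim = 8;

% Hypothetical stand-in for HOG features: one Gaussian cluster per digit.
labels  = repelem((0:numClasses-1)', perClass);
centers = 5 * randn(numClasses, dim);
X       = centers(labels + 1, :) + randn(numel(labels), dim);

% One binary SVM per digit: "this digit" (true) versus "any other digit" (false).
models = cell(numClasses, 1);
for d = 1:numClasses
    models{d} = fitcsvm(X, labels == d - 1);
end

% Ask every classifier whether a new sample is "its" digit.
x = centers(4, :);                        % a point at the centre of digit 3's cluster
votes = false(numClasses, 1);
for d = 1:numClasses
    votes(d) = predict(models{d}, x);
end
```

Because the ten classifiers decide independently, a sample can be claimed by several of them or by none. The fitcecoc function with 'Coding' set to 'onevsall' wraps the same scheme in a single multi-class model.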

Test the Classifier

Now the SVM classifiers can be tested using the handwritten digit images shown earlier.

Results

The columns of the table contain the classification results for each SVM classifier. Ideally, the table would be a diagonal matrix, where each diagonal element equals the number of images per digit (12 in this example). Based on this data set, digits 1, 2, 3, and 4 are easier to recognize than digit 6, for which there are many false positives. Using more representative data sets such as MNIST [2] or SVHN [3], which contain thousands of handwritten characters, is likely to produce a better classifier than the one created using this example data set.
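
How such a table is assembled from the individual classifier decisions can be sketched as below. The predictions are placeholders invented for illustration, not the results of this example, and the array layout and variable names are assumptions.

```matlab
numDigits = 10;  numImages = 12;

% Placeholder predictions, NOT real results:
% predicted(c, i, d) is true when classifier c fires on image i of digit d.
predicted = false(numDigits, numImages, numDigits);
for d = 1:numDigits
    predicted(d, :, d) = true;            % every classifier fires on its own digit
end
predicted(7, 1:3, 9) = true;              % made-up false positives: "6" classifier on digit 8

% Rows: true digit. Columns: SVM classifier. Entries: number of positive calls.
counts = squeeze(sum(predicted, 2))';

% Off-diagonal mass in a column is that classifier's false-positive count.
falsePositives = sum(counts, 1) - diag(counts)';
```

With perfect classifiers, counts would equal 12 times the identity matrix and falsePositives would be all zeros.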

Summary

This example illustrated the basic procedure for creating an object classifier using the extractHOGFeatures function from the Computer Vision System Toolbox™ and the svmclassify and svmtrain functions from the Statistics and Machine Learning Toolbox™. Although HOG features and SVM classifiers were used here, other features and machine learning algorithms can be used in the same way. For instance, you can explore different feature types for training the classifier, or you can see the effect of using other machine learning algorithms available in the Statistics and Machine Learning Toolbox™, such as k-nearest neighbors.
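
As a sketch of swapping in another learner, the snippet below trains a k-nearest-neighbor classifier with fitcknn. The features are a hypothetical stand-in for HOG vectors, and the choice of three neighbors is arbitrary. Note also that svmtrain and svmclassify have been removed from recent MATLAB releases in favor of fitcsvm and predict.

```matlab
rng(4);
numClasses = 10;  perClass = 20;  dim = 8;

% Hypothetical stand-in for HOG features: one Gaussian cluster per digit.
labels  = repelem((0:numClasses-1)', perClass);
centers = 5 * randn(numClasses, dim);
X       = centers(labels + 1, :) + randn(numel(labels), dim);

% k-NN is natively multi-class, so a single model covers all ten digits.
knn = fitcknn(X, labels, 'NumNeighbors', 3);

% Resubstitution accuracy (optimistic; a held-out test set is the fairer measure).
trainAccuracy = mean(predict(knn, X) == labels);
```

Only the training call changes; feature extraction and the test loop stay the same as for the SVM version.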

Appendix - Helper functions

This example shows how to recognize handwritten digits using an ensemble of bagged classification trees. Images of handwritten digits are first used to train a single classification tree and then an ensemble of 200 decision trees. The classification performance of the two models is then compared using confusion matrices.
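
The comparison can be outlined with the sketch below, which uses fitctree, TreeBagger, and confusionmat from the Statistics and Machine Learning Toolbox. The data are hypothetical Gaussian clusters standing in for digit image features, so the accuracies it prints say nothing about the real data set.

```matlab
rng(5);
numClasses = 10;  dim = 8;

% Hypothetical stand-in for image features: one Gaussian cluster per digit.
centers = 3 * randn(numClasses, dim);
Ytrain  = repelem((0:numClasses-1)', 30);
Ytest   = repelem((0:numClasses-1)', 10);
Xtrain  = centers(Ytrain + 1, :) + randn(numel(Ytrain), dim);
Xtest   = centers(Ytest  + 1, :) + randn(numel(Ytest),  dim);

% Single classification tree.
tree     = fitctree(Xtrain, Ytrain);
predTree = predict(tree, Xtest);

% Ensemble of 200 bagged trees.
forest     = TreeBagger(200, Xtrain, Ytrain, 'Method', 'classification');
predForest = str2double(predict(forest, Xtest));   % TreeBagger returns labels as text

% Confusion matrices: rows are true digits, columns are predicted digits.
cmTree   = confusionmat(Ytest, predTree,   'Order', (0:9)');
cmForest = confusionmat(Ytest, predForest, 'Order', (0:9)');

fprintf('Single tree accuracy:  %.2f\n', trace(cmTree)   / numel(Ytest));
fprintf('Bagged trees accuracy: %.2f\n', trace(cmForest) / numel(Ytest));
```

The diagonal of each confusion matrix counts correctly classified test samples, so comparing the two diagonals digit by digit shows where the ensemble helps.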

Load Training and Test Data