Decomposition to classify with DictionaryLearning

到不了的都叫做远方

Modified on 2020-04-20 15:07:17


In this recipe, we'll show how a decomposition method can actually be used for classification. DictionaryLearning takes a dataset and attempts to transform it into a sparse representation.

Getting ready

With DictionaryLearning, the idea is that the learned features form a basis for the resulting dataset. In an effort to keep this recipe short, I'll assume you have iris_data and iris_target ready to go.
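If iris_data and iris_target are not already defined, one way to create them is sketched below. This assumes the data comes from scikit-learn's bundled iris dataset, which the recipe does not state explicitly.

```python
from sklearn.datasets import load_iris

# Assumption: the recipe's iris_data / iris_target are the feature matrix
# and label vector of scikit-learn's bundled iris dataset.
iris = load_iris()
iris_data = iris.data      # 150 samples, 4 measurements each
iris_target = iris.target  # species labels encoded as 0, 1, 2

print(iris_data.shape)
print(sorted(set(iris_target.tolist())))
```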

How to do it...

First, import DictionaryLearning:

from sklearn.decomposition import DictionaryLearning

Next, use three components to represent the three species of iris:

dl = DictionaryLearning(3)

Then fit and transform every other data point, so that the remaining data points can be used to test the classifier after the learner is trained:

transformed = dl.fit_transform(iris_data[::2])
transformed[:5]
array([[ 0. , 6.34476574, 0. ],
       [ 0. , 5.83576461, 0. ],
       [ 0. , 6.32038375, 0. ],
       [ 0. , 5.89318572, 0. ],
       [ 0. , 5.45222715, 0. ]])

We can visualize the output. Notice how each point lies on the x, y, or z axis: one coordinate holds a nonzero value and the others are 0. This is called sparseness.
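Sparseness can also be checked numerically rather than by eye. The sketch below counts the nonzero entries in each transformed row. It sets transform_algorithm, transform_n_nonzero_coefs, and random_state explicitly, which the recipe does not do, so the exact values may differ from the output shown above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import DictionaryLearning

iris_data = load_iris().data

# Assumption: the transform settings are spelled out so that each sample
# is coded with at most one dictionary atom; the recipe relies on defaults.
dl = DictionaryLearning(
    n_components=3,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=1,
    random_state=0,
)
dl.fit(iris_data[::2])
codes = dl.transform(iris_data[::2])

# Count how many of the three coordinates are nonzero in each row.
nonzeros_per_row = np.count_nonzero(codes, axis=1)
print(codes.shape)
print(nonzeros_per_row.max())
```

A row with a single nonzero entry corresponds to a point that sits exactly on one axis in the three-dimensional plot.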

If you look closely, you can see there was some training error: one of the samples was misclassified. Only being wrong once isn't a big deal, though. Next, let's transform (not fit_transform) the testing set:

transformed = dl.transform(iris_data[1::2])

The following screenshot shows its performance:

Notice again that there was some error in the classification. If you remember some of the other visualizations, the blue and green classes were the two that often appeared close together.
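The recipe judges the classifier from the plots. A rough way to put a number on it is sketched below: each sample is assigned to the dictionary atom with the largest coefficient, and each atom is labelled with the majority species among the training samples that use it. This labelling rule and the helper active_atom are illustrative assumptions, not part of the original recipe.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import DictionaryLearning

iris = load_iris()
iris_data, iris_target = iris.data, iris.target
X_train, y_train = iris_data[::2], iris_target[::2]
X_test, y_test = iris_data[1::2], iris_target[1::2]

# Assumption: explicit transform settings and random_state for repeatability.
dl = DictionaryLearning(
    n_components=3,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=1,
    random_state=0,
)
dl.fit(X_train)


def active_atom(codes):
    """Hypothetical helper: index of the largest-magnitude coefficient per row."""
    return np.abs(codes).argmax(axis=1)


train_atoms = active_atom(dl.transform(X_train))
test_atoms = active_atom(dl.transform(X_test))

# Label each atom with the majority species of the training samples using it.
# An atom that no training sample uses gets the placeholder label -1.
atom_to_class = {}
for atom in range(3):
    labels = y_train[train_atoms == atom]
    atom_to_class[atom] = int(np.bincount(labels).argmax()) if labels.size else -1

predicted = np.array([atom_to_class[int(a)] for a in test_atoms])
accuracy = float((predicted == y_test).mean())
print(accuracy)
```

The accuracy depends on the scikit-learn version and the random seed, so no expected value is given here.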

How it works...

DictionaryLearning has a background in signal processing and neurology. The idea is that only a few features can be active at any given time. Therefore, DictionaryLearning attempts to find a suitable representation of the underlying data, under the constraint that most of the features should be 0.
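To make the idea of a sparse representation concrete, the sketch below multiplies the sparse codes by the learned dictionary (the components_ attribute) to approximate the original measurements. The explicit transform settings and random_state are assumptions added for repeatability.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import DictionaryLearning

X = load_iris().data[::2]

# Assumption: one active atom per sample, fixed seed.
dl = DictionaryLearning(
    n_components=3,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=1,
    random_state=0,
)
dl.fit(X)
codes = dl.transform(X)  # sparse codes, one row per sample

# Each row of components_ is one dictionary atom in the original feature space.
dictionary = dl.components_

# A sample is approximated by its code times the dictionary.
reconstruction = codes @ dictionary
relative_error = np.linalg.norm(X - reconstruction) / np.linalg.norm(X)
print(dictionary.shape)
print(relative_error)
```

Because only one coefficient is active per sample, each reconstructed row is a scaled copy of a single atom, which is why the atom index can stand in for a class label.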
