Working with QDA – a nonlinear LDA

Author: 到不了的都叫做远方
Modified: 2020-05-06 11:43:19

QDA is the generalization of a common technique such as quadratic regression. It is simply a generalization of the model that allows more complex models to fit, though, like all things, when we allow complexity to creep in, we make our lives more difficult.

Getting ready

We will expand on the last recipe and look at Quadratic Discriminant Analysis (QDA) via the QuadraticDiscriminantAnalysis object. We said we made an assumption about the covariance of the model; here, we will relax that assumption.
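
To see what relaxing the shared-covariance assumption buys us, here is a minimal sketch (on synthetic data, not the stock frame used below) in which the two classes share a mean but differ in covariance, so LDA's single pooled covariance gives a near-chance linear boundary while QDA's per-class covariances recover the elliptical one:

```python
import numpy as np
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis)

rng = np.random.RandomState(0)
# Both classes are centered at the origin; only the covariance differs,
# so no single hyperplane can separate them.
X0 = rng.multivariate_normal([0, 0], [[1, 0], [0, 1]], 500)
X1 = rng.multivariate_normal([0, 0], [[6, 0], [0, 0.3]], 500)
X_syn = np.vstack([X0, X1])
y_syn = np.array([0] * 500 + [1] * 500)

lda = LinearDiscriminantAnalysis().fit(X_syn, y_syn)
qda = QuadraticDiscriminantAnalysis().fit(X_syn, y_syn)
print(lda.score(X_syn, y_syn), qda.score(X_syn, y_syn))
```

LDA pools the two sample covariances into one and so cannot do much better than guessing here; QDA estimates a covariance matrix per class, which is exactly the relaxation this recipe is about.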

How to do it...

QDA is implemented as the QuadraticDiscriminantAnalysis class of the discriminant_analysis module. Use the following commands to use QDA:

from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis as QDA
qda = QDA()
qda.fit(X.iloc[:, :-1], X.iloc[:, -1])
predictions = qda.predict(X.iloc[:, :-1])
predictions.sum()
2812.0
from sklearn.metrics import classification_report
print(classification_report(X.iloc[:, -1].values, predictions))

              precision    recall  f1-score   support

         0.0       0.69      0.22      0.33      3083
         1.0       0.41      0.84      0.55      1953

    accuracy                           0.46      5036
   macro avg       0.55      0.53      0.44      5036
weighted avg       0.58      0.46      0.42      5036

As you can see, it's about equal on the whole. If we look back at the LDA recipe, we can see large changes for class 0 relative to the QDA object, and only minor differences for class 1.
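
The class scores above come from per-class Gaussian fits. As a sketch of what the QDA object is doing internally (synthetic data again, since X here is the stock-price frame from the earlier recipe), its predictions can be reproduced from the fitted means, per-class covariances, and priors:

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

rng = np.random.RandomState(1)
A = rng.multivariate_normal([0, 0], [[1, 0.3], [0.3, 1]], 300)
B = rng.multivariate_normal([1.5, 1.0], [[2, -0.5], [-0.5, 0.5]], 300)
X_syn = np.vstack([A, B])
y_syn = np.array([0] * 300 + [1] * 300)

qda = QuadraticDiscriminantAnalysis(store_covariance=True).fit(X_syn, y_syn)

# Score each point under each class: log prior + Gaussian log-likelihood,
# using the means and per-class covariances the estimator fitted.
scores = np.column_stack([
    np.log(qda.priors_[k])
    + multivariate_normal(qda.means_[k], qda.covariance_[k]).logpdf(X_syn)
    for k in range(2)
])
manual = scores.argmax(axis=1)
print((manual == qda.predict(X_syn)).mean())
```

Note that `covariance_` is only stored when `store_covariance=True`; the argmax over these per-class scores should match `qda.predict` up to floating-point ties.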

How it works…

Like we talked about in the last recipe, we essentially compare likelihoods here. So, how do we compare likelihoods? Let's just use the price at hand to attempt to classify is_higher. We'll assume that the closing price is log-normally distributed. In order to compute the likelihood for each class, we need to create the subsets of closes as well as a training and test set for each class. As a sneak peek at the next chapter, we'll use the built-in cross-validation methods:

from sklearn.model_selection import train_test_split
import scipy.stats as sp
train, test = train_test_split(X)
train_close = train.Close

train_0 = train_close[~train.is_higher.astype(bool)]
train_1 = train_close[train.is_higher.astype(bool)]
test_close = test.Close.values

ll_0 = sp.norm.pdf(test_close, train_0.mean())
ll_1 = sp.norm.pdf(test_close, train_1.mean())

Now that we have likelihoods for both classes, we can compare and assign classes:

(ll_0 > ll_1).mean()
0.18374371194069367
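
The pdf call above fixes the scale at 1 and uses a normal density even though we assumed log-normal prices. A fuller sketch of the same comparison, with synthetic prices standing in for Close (since X comes from the previous recipe) and a log-normal location and scale fitted per class:

```python
import numpy as np
import scipy.stats as sp

rng = np.random.RandomState(0)
# Synthetic stand-ins for the closing prices of each class.
closes_0 = rng.lognormal(mean=3.0, sigma=0.2, size=400)
closes_1 = rng.lognormal(mean=3.3, sigma=0.2, size=400)
test_close = np.concatenate([rng.lognormal(3.0, 0.2, 100),
                             rng.lognormal(3.3, 0.2, 100)])
truth = np.array([0] * 100 + [1] * 100)

# Fitting a log-normal is a normal fit on log-prices.
mu_0, s_0 = np.log(closes_0).mean(), np.log(closes_0).std()
mu_1, s_1 = np.log(closes_1).mean(), np.log(closes_1).std()

# scipy parameterizes lognorm with shape s=sigma and scale=exp(mu).
ll_0 = sp.lognorm.pdf(test_close, s=s_0, scale=np.exp(mu_0))
ll_1 = sp.lognorm.pdf(test_close, s=s_1, scale=np.exp(mu_1))

pred = (ll_1 > ll_0).astype(int)
print((pred == truth).mean())
```

Assigning each test price to the class with the higher likelihood is the same comparison as before, just with both location and scale estimated from each class's training subset.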

This article is a translation of a foreign-language text.

In case of infringement, please contact cloudcommunity@tencent.com for removal.
