Translated scikit-learn Cookbook
Learning sklearn
78 articles · 52,849 reads · 15 subscribers
Automatic cross validation
Tags: spring, scikit-learn, machine learning, neural networks
We've looked at using the cross-validation iterators that scikit-learn comes with, but we can also use a helper function to perform cross-validation automatically. This is similar to how other objects in scikit-learn are wrapped by helper functions, pipelines for instance.
到不了的都叫做远方 · 2019-12-10 · 617 views · 0 comments
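The helper the excerpt alludes to is `cross_val_score`. A minimal sketch with the current `sklearn.model_selection` import path (the original recipe likely predates it and used older modules; the toy dataset here is my own assumption):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# cross_val_score runs the whole CV loop for us: it splits the data,
# fits a fresh model per fold, and returns one score per fold.
X, y = make_regression(n_samples=100, n_features=3, noise=1.0, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=5)  # five R^2 scores
```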
Using many Decision Trees – random forests
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
In this recipe, we'll use random forests for classification tasks. Random forests are used because they're very robust to overfitting and perform well in a variety of situations.
到不了的都叫做远方 · 2019-11-29 · 623 views · 0 comments
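A hedged sketch of the classifier this recipe covers, on a synthetic dataset of my choosing:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# A random forest averages many decision trees fit on bootstrap samples,
# which is what makes it robust to overfitting relative to a single tree.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
```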
Probabilistic clustering with Gaussian Mixture Models
Tags: scikit-learn, machine learning, http, neural networks, deep learning
In KMeans, we assume that the variance of the clusters is equal. This leads to a subdivision of space that determines how the clusters are assigned; but what about a situation where the variances are not equal and each cluster point has some…
到不了的都叫做远方 · 2019-11-25 · 605 views · 0 comments
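The contrast with KMeans the excerpt draws can be sketched with `GaussianMixture` (the blob data is an assumption, not from the article):

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Unlike KMeans, a Gaussian mixture fits a per-cluster covariance and
# assigns each point a probability of belonging to each cluster.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
probs = gmm.predict_proba(X)  # soft assignments; each row sums to 1
```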
Finding the closest objects in the feature space
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
Sometimes, the easiest thing to do is to just find the distance between two objects. We just need to find some distance metric, compute the pairwise distances, and compare the outcomes to what's expected.
到不了的都叫做远方 · 2019-11-24 · 649 views · 0 comments
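The pairwise-distance computation the excerpt describes is a one-liner in scikit-learn; a minimal, self-checking example:

```python
import numpy as np
from sklearn.metrics import pairwise_distances

# Distances between all pairs of rows of X; the metric is Euclidean
# by default but is configurable via the metric= parameter.
X = np.array([[0.0, 0.0], [3.0, 4.0]])
D = pairwise_distances(X)
print(D[0, 1])  # 5.0, the classic 3-4-5 triangle
```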
Quantizing an image with KMeans clustering
Tags: scikit-learn, machine learning, neural networks, deep learning
Image processing is an important topic in which clustering has some application.
到不了的都叫做远方 · 2019-11-23 · 1K views · 0 comments
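The recipe quantizes a real image; here a random pixel array stands in for it (an assumption on my part) to show the mechanics:

```python
import numpy as np
from sklearn.cluster import KMeans

# Treat each pixel as a point in RGB space and cluster into 4 colours;
# replacing every pixel with its centroid colour quantizes the "image".
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(100, 3)).astype(float)
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(pixels)
quantized = km.cluster_centers_[km.labels_]
```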
Directly applying Bayesian ridge regression
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
In the Using ridge regression to overcome linear regression's shortfalls recipe, we discussed the connections between the constraints imposed by ridge regression from an optimization standpoint. We also discussed the Bayesian interpretation of priors on the coefficients, which attract the mass of the density towards the prior, which often has a mean of 0.
到不了的都叫做远方 · 2019-11-18 · 1.5K views · 0 comments
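A short sketch of scikit-learn's `BayesianRidge`, which implements the zero-mean prior the excerpt mentions (the dataset is a stand-in):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import BayesianRidge

# BayesianRidge places zero-mean Gaussian priors on the coefficients and
# can return the posterior predictive standard deviation with the mean.
X, y = make_regression(n_samples=100, n_features=5, noise=1.0, random_state=0)
br = BayesianRidge().fit(X, y)
mean_pred, std_pred = br.predict(X, return_std=True)
```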
Using sparsity to regularize models
Tags: linear regression, scikit-learn, machine learning, neural networks, deep learning
The least absolute shrinkage and selection operator (LASSO) method is very similar to ridge regression and LARS. It's similar to ridge regression in that we penalize our regression by some amount, and it's similar to LARS in that it can be used for parameter selection; it typically leads to a sparse vector of coefficients.
到不了的都叫做远方 · 2019-11-14 · 512 views · 0 comments
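The sparsity claim can be demonstrated directly with `Lasso` (synthetic data assumed; the exact sparsity level depends on `alpha`):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 50 features but only 5 informative: the L1 penalty should drive most
# coefficients exactly to zero, giving the sparse vector described above.
X, y = make_regression(n_samples=100, n_features=50, n_informative=5,
                       random_state=0)
lasso = Lasso(alpha=1.0).fit(X, y)
n_nonzero = int(np.sum(lasso.coef_ != 0))
```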
Optimizing the ridge regression parameter
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
Once you start using ridge regression to make predictions or learn about relationships in the system you're modeling, you'll start thinking about the choice of alpha. For example, using OLS regression might show some relationship between two variables; however, when regularized by some alpha, the relationship may no longer be significant. This can affect whether a decision needs to be taken.
到不了的都叫做远方 · 2019-11-13 · 1.5K views · 0 comments
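One standard way to choose alpha, sketched with `RidgeCV` (the candidate grid and dataset are assumptions; the book may use a different search):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

# RidgeCV tries each candidate alpha with built-in cross-validation
# and keeps the one that generalizes best.
X, y = make_regression(n_samples=100, n_features=10, noise=5.0, random_state=0)
ridge = RidgeCV(alphas=[0.1, 1.0, 10.0]).fit(X, y)
best_alpha = ridge.alpha_
```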
Using stochastic gradient descent for regression
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
In this recipe, we'll get our first taste of stochastic gradient descent. We'll use it for regression here, but for the next recipe, we'll use it for classification.
到不了的都叫做远方 · 2019-11-09 · 541 views · 0 comments
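A minimal `SGDRegressor` sketch (the scaling step and hyperparameters are my assumptions, not the recipe's exact code):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# SGD updates the coefficients a sample at a time; scaling the features
# first matters because the step size is shared across features.
X, y = make_regression(n_samples=200, n_features=5, noise=1.0, random_state=0)
X = StandardScaler().fit_transform(X)
sgd = SGDRegressor(max_iter=1000, tol=1e-3, random_state=0).fit(X, y)
```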
Using Gaussian processes for regression
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
In this recipe, we'll use the Gaussian process for regression. In the linear models section, we saw how representing prior information on the coefficients was possible using Bayesian ridge regression.
到不了的都叫做远方 · 2019-11-07 · 977 views · 0 comments
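A sketch with the current `GaussianProcessRegressor` API (the book likely uses the older `GaussianProcess` class; kernel and data are assumptions):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# A GP with an RBF kernel: like Bayesian ridge regression it is Bayesian,
# but the prior lives over functions rather than over coefficients.
X = np.linspace(0, 5, 20)[:, None]
y = np.sin(X).ravel()
gp = GaussianProcessRegressor(kernel=RBF(), random_state=0).fit(X, y)
mean, std = gp.predict(X, return_std=True)  # posterior mean and uncertainty
```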
Using truncated SVD to reduce dimensionality
Tags: data analysis, scikit-learn, machine learning, neural networks
Truncated singular value decomposition (SVD) is a matrix factorization technique that factors a matrix M into the three matrices U, Σ, and V. This is very similar to PCA, except that the factorization for SVD is done on the data matrix, whereas for PCA, the factorization is done on the covariance matrix. Typically, SVD is used under the hood to find the principal components of a matrix.
到不了的都叫做远方 · 2019-11-03 · 2.1K views · 0 comments
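The estimator the excerpt describes, applied to iris as a stand-in dataset:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import TruncatedSVD

# TruncatedSVD factorizes the data matrix itself (no centering step),
# which is the difference from PCA noted above.
X = load_iris().data  # shape (150, 4)
X_2d = TruncatedSVD(n_components=2).fit_transform(X)
```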
Reducing dimensionality with PCA
Tags: data analysis, scikit-learn, machine learning, neural networks, deep learning
Now it's time to take the math up a level! Principal component analysis (PCA) is the first somewhat advanced technique discussed in this book. While everything else thus far has been simple statistics, PCA will combine statistics and linear algebra to produce a preprocessing step that can help to reduce dimensionality, which can be the enemy of a simple model.
到不了的都叫做远方 · 2019-10-31 · 747 views · 0 comments
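A minimal PCA sketch on iris (my choice of dataset, not the article's):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# PCA diagonalizes the covariance matrix and keeps the directions of
# largest variance; two components retain most of iris's variance.
X = load_iris().data
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)
retained = float(pca.explained_variance_ratio_.sum())
```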
Using Pipelines to combine multiple preprocessing steps
Tags: css, scikit-learn, machine learning, neural networks, deep learning
Pipelines are (at least to me) something I don't often think about using, but they are useful. They can be used to tie together many steps into one object. This allows for easier tuning and better access to the configuration of the entire model, not just one of the steps.
到不了的都叫做远方 · 2019-10-30 · 1.6K views · 0 comments
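The "many steps into one object" idea, sketched with two assumed steps (scaling then logistic regression):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# fit() scales then trains, and the whole chain behaves as a single
# estimator, so it can be tuned and cross-validated as one unit.
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression())])
X, y = make_classification(n_samples=100, random_state=0)
pipe.fit(X, y)
```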
Imputing missing values through various strategies
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
Data imputation is critical in practice, and thankfully there are many ways to deal with it. In this recipe, we'll look at a few of the strategies. However, be aware that there might be other approaches that fit your situation better.
到不了的都叫做远方 · 2019-10-30 · 836 views · 0 comments
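One of the strategies, sketched with the modern `SimpleImputer` (a 2019-era translation may use the older `sklearn.preprocessing.Imputer` instead):

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Fill each NaN with its column's mean; "median" and "most_frequent"
# are alternative strategies.
X = np.array([[1.0, np.nan],
              [3.0, 4.0],
              [np.nan, 6.0]])
X_filled = SimpleImputer(strategy="mean").fit_transform(X)
```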
Working with categorical variables
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
Categorical variables are a problem. On one hand they provide valuable information; on the other hand, it's probably text, either the actual text or integers corresponding to the text, like an index in a lookup table. So, we clearly need to represent our text as integers for the model's sake, but we can't just use the id field or naively represent them. This is because we need to avoid a similar problem to the Creating binary features through thresholding recipe: if data is to be treated as continuous, it must actually be continuous.
到不了的都叫做远方 · 2019-10-29 · 805 views · 0 comments
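The usual fix for the id-field problem is one-hot encoding; a sketch with an assumed toy column of colour names:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# One binary column per category, so no spurious order is implied the
# way it would be if we fed raw integer ids to the model.
colours = np.array([["red"], ["green"], ["blue"], ["green"]])
onehot = OneHotEncoder().fit_transform(colours).toarray()
```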
Creating binary features through thresholding
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
In the last recipe, we looked at transforming our data into the standard normal distribution. Now, we'll talk about another transformation, one that is quite different.
到不了的都叫做远方 · 2019-10-28 · 419 views · 0 comments
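The thresholding transformation named in the title, in a minimal form:

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Values strictly above the threshold become 1, the rest become 0.
X = np.array([[0.2, 1.5],
              [3.0, 0.1]])
Xb = Binarizer(threshold=1.0).fit_transform(X)  # [[0, 1], [1, 0]]
```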
Scaling data to the standard normal
Tags: c++, object-oriented programming, scikit-learn, machine learning, neural networks
A preprocessing step that is almost always recommended is to scale columns to the standard normal. The standard normal is probably the most important distribution in all of statistics.
到不了的都叫做远方 · 2019-10-27 · 1.2K views · 0 comments
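The scaling step the excerpt recommends, sketched on a single toy column:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Subtract each column's mean and divide by its standard deviation,
# leaving every column with mean 0 and variance 1.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
Xs = StandardScaler().fit_transform(X)
```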
scikit-learn Cookbook 01
Tags: scikit-learn, numpy, machine learning, neural networks, deep learning
I will again implore you to use some of your own data for this book, but in the event you cannot, we'll learn how we can use scikit-learn to create toy data.
到不了的都叫做远方 · 2019-10-26 · 399 views · 0 comments
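A small sketch of the toy-data facilities the chapter introduces (the specific generator and dataset chosen here are assumptions):

```python
from sklearn import datasets

# scikit-learn ships both generators for synthetic data and small
# bundled datasets, so examples work even without your own data.
X, y = datasets.make_classification(n_samples=100, n_features=4,
                                    random_state=0)
iris = datasets.load_iris()
```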
scikit-learn Cookbook 00
Tags: scikit-learn, machine learning, neural networks, deep learning, artificial intelligence
This chapter discusses setting up data, preparing data, and pre-model dimensionality reduction. These are not the…
到不了的都叫做远方 · 2019-10-25 · 414 views · 0 comments