Translating the scikit-learn Cookbook

Learning sklearn
Column authors
78 articles
52872 views
15 subscribers
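Beating the Machine Learning Paper Tiger in 2021, Week 2: Decision Trees (Part 1)
Hello everyone, and welcome to Week 2 of Beating the Machine Learning Paper Tiger in 2021: Decision Trees (Part 1). Step by step, starting from simple examples, I am working my way into the world of machine learning. (Because of the PPT size limit, and because compressing the slides distorts them, I'll paste them in one page at a time.)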
到不了的都叫做远方
2021-01-31
5620
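[Technical Writing 101 Bootcamp] Beating the Machine Learning Paper Tiger in 2021, Week 1: Thoughts on Models
I have been around machine learning for more than three years. I've read a few books and taken quite a few online classes, but with no real application I am still at the copy-the-code stage and haven't made progress. Now that it's time to set goals for 2021, I'm taking this one up and making a pact with myself: write a series, keep to one post a week, and go from learning to application.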
到不了的都叫做远方
2021-01-20
4470
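Rethinking Machine Learning: What Is Machine Learning?
With technology where it is today, people no longer do many things themselves; they hand them over either to professionals or to machines. So some people became professionals, some became professional bluffers, some went off to tinker with machines, and a large share of people were freed up to enjoy their leisure.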
到不了的都叫做远方
2020-05-08
3350
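Learning Data Structures: Python Implementation 01 (0401)
After more than two years of teaching myself for a career change, I've picked up a jumble of things but still haven't reached the direction I want to go in. Keep learning and keep at it!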
到不了的都叫做远方
2020-04-01
4400
Automatic cross validation
We've looked at using the cross validation iterators that scikit-learn comes with, but we can also use a helper function to perform cross validation for us automatically. This is similar to how other objects in scikit-learn are wrapped by helper functions, pipeline for instance.
到不了的都叫做远方
2019-12-10
6170
Classifying documents with Naïve Bayes
Naïve Bayes is a really interesting model. It's somewhat similar to k-NN in the sense that it makes some assumptions that might oversimplify reality, but it still performs well in many cases.
到不了的都叫做远方
2019-12-07
3880
Classifying data with support vector machines
Support vector machines (SVMs) are one of the techniques we will use that doesn't have an easy probabilistic interpretation. The idea behind SVMs is that we find the plane that separates the groups in the dataset "best". Here, "best" means that the chosen plane maximizes the margin between itself and the closest points of each group. These points are called support vectors.
到不了的都叫做远方
2019-12-01
4680
Using many Decision Trees – random forests
In this recipe, we'll use random forests for classification tasks. Random forests are used because they're very robust to overfitting and perform well in a variety of situations.
到不了的都叫做远方
2019-11-29
6230
Tuning a Decision Tree model
If we use just the basic implementation of a Decision Tree, it will probably not fit very well. Therefore, we need to tweak the parameters in order to get a good fit. This is very easy and won't require much effort.
到不了的都叫做远方
2019-11-28
1.2K0
Doing basic classifications with Decision Trees
In this recipe, we will perform basic classifications using Decision Trees. These are very nice models because they are easily understandable, and once trained, scoring is very simple.
到不了的都叫做远方
2019-11-27
3570
4 Classifying Data with scikit-learn
This chapter will cover the following topics:
到不了的都叫做远方
2019-11-27
3190
Using KMeans for outlier detection
In this chapter, we'll look at both the debate and mechanics of KMeans for outlier detection. It can be useful to isolate some types of errors, but care should be taken when using it.
到不了的都叫做远方
2019-11-26
1.9K0
Probabilistic clustering with Gaussian Mixture Models
In KMeans, we assume that the variance of the clusters is equal. This leads to a subdivision of space that determines how the clusters are assigned; but what about a situation where the variances are not equal and each cluster point has som
到不了的都叫做远方
2019-11-25
6050
Finding the closest objects in the feature space
Sometimes, the easiest thing to do is to just find the distance between two objects. We just need to find some distance metric, compute the pairwise distances, and compare the outcomes to what's expected.
到不了的都叫做远方
2019-11-24
6500
Quantizing an image with KMeans clustering
Image processing is an important topic in which clustering has some application.
到不了的都叫做远方
2019-11-23
1K0
Assessing cluster correctness & using MiniBatch KMeans to handle more data
We talked a little bit about assessing clusters when the ground truth is not known. However, we have not yet talked about assessing KMeans when the cluster is known. In a lot of cases, this isn't knowable; however, if there is outside annotation, we will sometimes know the ground truth, or at least a proxy for it.
到不了的都叫做远方
2019-11-22
8110
Optimizing the number of centroids
Centroids are difficult to interpret, and it can also be very difficult to determine whether we have the correct number of centroids. It's important to understand whether your data is unlabeled or not as this will directly influence the evaluation measures we can use.
到不了的都叫做远方
2019-11-21
4810
Using KMeans to cluster data
Clustering is a very useful technique. Often, we need to divide and conquer when taking actions. Consider a list of potential customers for a business. A business might need to group customers into cohorts, and then departmentalize responsibilities for these cohorts. Clustering can help facilitate this grouping process. KMeans is probably one of the most well-known clustering algorithms and, in a larger sense, one of the most well-known unsupervised learning techniques.
到不了的都叫做远方
2019-11-20
7870
3 Building Models with Distance Metrics
This chapter will cover the following topics:
到不了的都叫做远方
2019-11-20
3690
Directly applying Bayesian ridge regression
In the "Using ridge regression to overcome linear regression's shortfalls" recipe, we discussed the constraints imposed by ridge regression from an optimization standpoint. We also discussed the Bayesian interpretation of priors on the coefficients, which attract the mass of the density towards the prior, which often has a mean of 0.
到不了的都叫做远方
2019-11-18
1.5K0