# Ten Machine Learning Algorithms: Key Points and Code

Machine learning methods fall into three broad categories:

1. Supervised learning

2. Unsupervised learning

3. Reinforcement learning

Commonly used algorithms include:

SVM

K-nearest neighbors (KNN)

K-means

1. Linear Regression

Linear regression fits a line of the form Y = a*x + b, where:

Y: dependent variable

a: slope

x: independent variable

b: intercept

Python code

```python
# Import Library
# Import other necessary libraries like pandas, numpy...
from sklearn import linear_model

# Load Train and Test datasets
# Identify feature and response variable(s); values must be numeric numpy arrays
x_train = input_variables_values_training_datasets
y_train = target_variables_values_training_datasets
x_test = input_variables_values_test_datasets

# Create linear regression object
linear = linear_model.LinearRegression()

# Train the model using the training sets and check score
linear.fit(x_train, y_train)
linear.score(x_train, y_train)

# Equation coefficient and intercept
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

# Predict output
predicted = linear.predict(x_test)
```
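The snippet above uses placeholder variable names. A self-contained sketch on a tiny synthetic dataset (the numbers here are made up for illustration) might look like:

```python
import numpy as np
from sklearn import linear_model

# Synthetic data lying exactly on y = 3x + 2, so the fit is exact
x_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([5.0, 8.0, 11.0, 14.0])
x_test = np.array([[5.0]])

# Create and train the linear regression object
linear = linear_model.LinearRegression()
linear.fit(x_train, y_train)

# R^2 score on the training data (1.0 here, since the data is exactly linear)
score = linear.score(x_train, y_train)

# Slope and intercept recovered by the fit
print('Coefficient:', linear.coef_)    # approximately [3.]
print('Intercept:', linear.intercept_) # approximately 2.0

# Predict output for unseen x
predicted = linear.predict(x_test)     # approximately [17.]
```

Because the toy data is perfectly linear, the fitted slope and intercept recover the generating equation.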

R code

```r
# Load Train and Test datasets
# Identify feature and response variable(s); values must be numeric
x_train
```

2. Logistic Regression

odds = p / (1 - p) = probability of event occurrence / probability of no event occurrence

ln(odds) = ln(p / (1 - p))

logit(p) = ln(p / (1 - p)) = b0 + b1*X1 + b2*X2 + b3*X3 + ... + bk*Xk

Python code

```python
# Import Library
from sklearn.linear_model import LogisticRegression

# Assumed you have X (predictor) and Y (target) for the training data set
# and x_test (predictor) of the test dataset

# Create logistic regression object
model = LogisticRegression()

# Train the model using the training sets and check score
model.fit(X, y)
model.score(X, y)

# Equation coefficient and intercept
print('Coefficient: \n', model.coef_)
print('Intercept: \n', model.intercept_)

# Predict output
predicted = model.predict(x_test)
```
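As a minimal runnable sketch of the snippet above, assuming a tiny linearly separable toy dataset (the data is made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy 1-D data: class 0 for small x, class 1 for large x
X = np.array([[0.5], [1.0], [1.5], [3.5], [4.0], [4.5]])
y = np.array([0, 0, 0, 1, 1, 1])
x_test = np.array([[0.2], [5.0]])

# Create and train the logistic regression object
model = LogisticRegression()
model.fit(X, y)

# Mean accuracy on the training set (1.0 here, since the classes are separable)
score = model.score(X, y)

# Predict output for unseen points
predicted = model.predict(x_test)
```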


3. Decision Tree

Python code

```python
# Import Library
# Import other necessary libraries like pandas, numpy...
from sklearn import tree

# Assumed you have X (predictor) and Y (target) for the training data set
# and x_test (predictor) of the test dataset

# Create tree object; for classification you can switch the criterion
# between 'gini' (the default) and 'entropy' (information gain)
model = tree.DecisionTreeClassifier(criterion='gini')
# model = tree.DecisionTreeRegressor()  # for regression

# Train the model using the training sets and check score
model.fit(X, y)
model.score(X, y)

# Predict output
predicted = model.predict(x_test)
```
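A self-contained sketch of the decision-tree snippet above, on a made-up one-feature dataset:

```python
import numpy as np
from sklearn import tree

# Toy data: label is 1 when the single feature exceeds 2
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([0, 0, 0, 1, 1])

# Train a classification tree with the Gini impurity criterion
model = tree.DecisionTreeClassifier(criterion='gini')
model.fit(X, y)

# An unpruned tree fits the training data perfectly
score = model.score(X, y)

# Predict output on two unseen points
predicted = model.predict(np.array([[1.5], [3.5]]))
```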

R code

```r
library(rpart)
```

4. Support Vector Machine (SVM)

Python code

```python
# Import Library
from sklearn import svm

# Assumed you have X (predictor) and Y (target) for the training data set
# and x_test (predictor) of the test dataset

# Create SVM classification object; note the class is svm.SVC (capitalised).
# There are various options associated with it; this is a simple classifier.
model = svm.SVC()

# Train the model using the training sets and check score
model.fit(X, y)
model.score(X, y)

# Predict output
predicted = model.predict(x_test)
```
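A runnable version of the SVM snippet above, assuming two well-separated toy clusters (the data is invented for illustration):

```python
import numpy as np
from sklearn import svm

# Two well-separated clusters on a line
X = np.array([[0.0], [0.5], [1.0], [5.0], [5.5], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Default SVC uses an RBF kernel; the classes here are easily separable
model = svm.SVC()
model.fit(X, y)

# Predict output near each cluster
predicted = model.predict(np.array([[0.2], [5.8]]))
```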

R code

```r
library(e1071)
```

5. Naive Bayes

Naive Bayes is built on Bayes' theorem, P(c|x) = P(x|c) * P(c) / P(x), where:

P(c|x) is the posterior probability of the class (target) given the predictor (attribute)

P(c) is the prior probability of the class

P(x|c) is the likelihood, i.e. the probability of the predictor given the class

P(x) is the prior probability of the predictor

Python code

```python
# Import Library
from sklearn.naive_bayes import GaussianNB

# Assumed you have X (predictor) and Y (target) for the training data set
# and x_test (predictor) of the test dataset

# Create Gaussian Naive Bayes object; there are other distributions for
# multinomial classes, such as Bernoulli Naive Bayes
model = GaussianNB()

# Train the model using the training sets and check score
model.fit(X, y)

# Predict output
predicted = model.predict(x_test)
```
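A minimal runnable sketch of the Gaussian Naive Bayes snippet above, on two invented Gaussian-like clusters:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Two tight clusters: class 0 around x=1, class 1 around x=4
X = np.array([[1.0], [1.2], [0.8], [4.0], [4.2], [3.8]])
y = np.array([0, 0, 0, 1, 1, 1])

# Fit a Gaussian per class and predict by the posterior P(c|x)
model = GaussianNB()
model.fit(X, y)

# Predict output for one point near each cluster
predicted = model.predict(np.array([[1.1], [3.9]]))
```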

R code

```r
library(e1071)
```

6. KNN (K-Nearest Neighbors)

KNN is computationally expensive.

Python code

```python
# Import Library
from sklearn.neighbors import KNeighborsClassifier

# Assumed you have X (predictor) and Y (target) for the training data set
# and x_test (predictor) of the test dataset

# Create KNeighbors classifier object
model = KNeighborsClassifier(n_neighbors=6)  # default value for n_neighbors is 5

# Train the model using the training sets and check score
model.fit(X, y)

# Predict output
predicted = model.predict(x_test)
```
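A self-contained version of the KNN snippet above; the toy data is made up, and `n_neighbors` is lowered to 3 only because the dataset is tiny:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two separated clusters on a line
X = np.array([[0.0], [0.5], [1.0], [5.0], [5.5], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# n_neighbors=3 so each query is decided by one cluster's points
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

# Predict output: each test point takes the majority label of its 3 neighbors
predicted = model.predict(np.array([[0.3], [5.7]]))
```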

R code

```r
library(knn)
```

7. K-Means

K-means is an unsupervised learning algorithm that solves clustering problems. Using K-means to group data into a given number of clusters (say, k clusters) is straightforward: data points within a cluster are homogeneous, and heterogeneous to those in other clusters.

How K-means forms clusters:

K-means picks k points, one per cluster, called centroids.

Each cluster has its own centroid. The sum of squared distances between a cluster's centroid and its data points forms that cluster's within-cluster sum of squares; adding these sums over all clusters gives the total sum of squares for the cluster solution.

Python code

```python
# Import Library
from sklearn.cluster import KMeans

# Assumed you have X (attributes) for the training data set
# and x_test (attributes) of the test dataset

# Create KMeans object
k_means = KMeans(n_clusters=3, random_state=0)

# Train the model using the training sets
k_means.fit(X)

# Predict output
predicted = k_means.predict(x_test)
```
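A runnable sketch of the K-means snippet above, on three obvious made-up clusters (`n_init` is set explicitly to keep behavior stable across scikit-learn versions):

```python
import numpy as np
from sklearn.cluster import KMeans

# Three obvious clusters on a line: around 1, 5, and 9
X = np.array([[1.0], [1.1], [5.0], [5.1], [9.0], [9.1]])

# Fit k=3 centroids; random_state makes the run reproducible
k_means = KMeans(n_clusters=3, random_state=0, n_init=10)
k_means.fit(X)

# Assign new points to the nearest centroid
labels = k_means.predict(np.array([[1.05], [9.05]]))
```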

R code

```r
library(cluster)
```

8. Random Forest

Python code

```python
# Import Library
from sklearn.ensemble import RandomForestClassifier

# Assumed you have X (predictor) and Y (target) for the training data set
# and x_test (predictor) of the test dataset

# Create Random Forest object
model = RandomForestClassifier()

# Train the model using the training sets and check score
model.fit(X, y)

# Predict output
predicted = model.predict(x_test)
```
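A self-contained version of the random-forest snippet above, on an invented toy dataset (`random_state` added only for reproducibility):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Two separated clusters on a line
X = np.array([[0.0], [0.5], [1.0], [5.0], [5.5], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# An ensemble of decision trees, each trained on a bootstrap sample
model = RandomForestClassifier(random_state=0)
model.fit(X, y)

# Predict output: the forest votes across its trees
predicted = model.predict(np.array([[0.2], [5.8]]))
```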

R code

```r
library(randomForest)
```

9. Dimensionality Reduction Algorithms
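PCA (principal component analysis) is one widely used dimensionality-reduction algorithm; picking PCA as the concrete example here, with made-up 2-D data, is an assumption. A minimal scikit-learn sketch:

```python
import numpy as np
from sklearn.decomposition import PCA

# 2-D points lying almost on the line y = x, so a single component
# captures nearly all of the variance
X = np.array([[1.0, 1.0], [2.0, 2.1], [3.0, 2.9], [4.0, 4.0]])

# Project the data onto its first principal component
pca = PCA(n_components=1)
X_reduced = pca.fit_transform(X)

# Fraction of total variance kept by that one component (close to 1 here)
explained = pca.explained_variance_ratio_[0]
```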
