# Understanding Random Forests with Scikit-learn

## Decomposing a Random Forest with treeinterpreter

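```python
from treeinterpreter import treeinterpreter as ti
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; needs an older release
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
import numpy as np

# Load the Boston housing data and train on the first 300 rows
boston = load_boston()
rf = RandomForestRegressor()
rf.fit(boston.data[:300], boston.target[:300])
```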

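```python
# Pick two instances from outside the training rows
instances = boston.data[[300, 309]]
# predict expects a 2D array, so index with a list to keep each instance as one row
print("Instance 0 prediction:", rf.predict(instances[[0]]))
print("Instance 1 prediction:", rf.predict(instances[[1]]))
```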

`prediction, biases, contributions = ti.predict(rf, instances)`

```python
for i in range(len(instances)):
    print("Instance", i)
    print("Bias (trainset mean)", biases[i])
    print("Feature contributions:")
    # Sort features by absolute contribution, largest first
    for c, feature in sorted(zip(contributions[i], boston.feature_names),
                             key=lambda x: -abs(x[0])):
        print(feature, round(c, 2))
    print("-" * 20)
```

```python
# The prediction equals the bias plus the sum of the feature contributions
print(prediction)
print(biases + np.sum(contributions, axis=1))
```
```
[ 30.76 22.41]
[ 30.76 22.41]
```
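The check above shows that each prediction equals the bias plus the sum of the feature contributions. Where those contributions come from can be sketched on a single hand-built tree: every split moves the running estimate from the parent node's mean to the child node's mean, and that change is credited to the feature used for the split. The tree, its thresholds, and the node values below are hypothetical illustration values, and `decompose` is a made-up helper, not treeinterpreter's actual implementation. For a forest, the same quantities would be averaged over all trees.

```python
# Hand-built toy regression tree (hypothetical values, not fitted on real data).
# Every node stores the mean target of the training samples that reach it.
TREE = {
    "value": 20.0, "feature": "RM", "threshold": 6.5,
    "left": {"value": 17.0, "feature": "LSTAT", "threshold": 15.0,
             "left": {"value": 19.5},
             "right": {"value": 13.0}},
    "right": {"value": 30.0},
}


def decompose(tree, x):
    """Return (prediction, bias, contributions) for one sample x given as a dict."""
    bias = tree["value"]  # mean at the root, i.e. the training-set mean
    contributions = {}
    node = tree
    while "feature" in node:  # leaves carry no "feature" key
        feature = node["feature"]
        child = node["left"] if x[feature] <= node["threshold"] else node["right"]
        # Credit the change in the node mean to the feature that caused the split
        contributions[feature] = (contributions.get(feature, 0.0)
                                  + child["value"] - node["value"])
        node = child
    return node["value"], bias, contributions


prediction, bias, contributions = decompose(TREE, {"RM": 6.0, "LSTAT": 20.0})
print("Prediction:", prediction)
print("Bias:", bias)
print("Contributions:", contributions)
print("Bias + contributions:", bias + sum(contributions.values()))
```

Because each step adds `child - parent`, the sum telescopes along the decision path, which is why the bias plus the contributions lands exactly on the leaf value.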

## Comparing Two Datasets

```python
# Split the held-out rows into two datasets and compare the mean predictions
ds1 = boston.data[300:400]
ds2 = boston.data[400:]

print(np.mean(rf.predict(ds1)))
print(np.mean(rf.predict(ds2)))
```
```
22.1912
18.4773584906
```

```python
prediction1, bias1, contributions1 = ti.predict(rf, ds1)
prediction2, bias2, contributions2 = ti.predict(rf, ds2)
```

```python
# Mean contribution of each feature within each dataset
totalc1 = np.mean(contributions1, axis=0)
totalc2 = np.mean(contributions2, axis=0)
```

```python
# The summed contribution differences match the difference in mean predictions
print(np.sum(totalc1 - totalc2))
print(np.mean(prediction1) - np.mean(prediction2))
```
```
3.71384150943
3.71384150943
```
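The two printed values agree because both datasets are scored by the same forest, so they share the same bias and it cancels in the difference. A short derivation, using notation introduced here only for illustration: $f$ is the forest's prediction, $b$ the bias, $D_1$ and $D_2$ the two datasets with $N_1$ and $N_2$ rows, and $\bar{c}^{(1)}_k$, $\bar{c}^{(2)}_k$ the mean contribution of feature $k$ in each dataset (`totalc1` and `totalc2` in the code).

```latex
\begin{aligned}
\frac{1}{N_1}\sum_{i \in D_1} f(x_i) - \frac{1}{N_2}\sum_{j \in D_2} f(x_j)
&= \Big(b + \sum_{k} \bar{c}^{(1)}_k\Big) - \Big(b + \sum_{k} \bar{c}^{(2)}_k\Big) \\
&= \sum_{k} \big(\bar{c}^{(1)}_k - \bar{c}^{(2)}_k\big)
\end{aligned}
```

Each term of the final sum is the share of the gap in mean predictions that can be attributed to one feature.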

```python
# Per-feature difference in mean contribution, largest first
for c, feature in sorted(zip(totalc1 - totalc2, boston.feature_names), reverse=True):
    print(feature, round(c, 2))
```

## Classification Trees and Forests

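```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf = RandomForestClassifier(max_depth=4)
# Shuffle the row indices; a Python 3 range cannot be shuffled in place, so use an array
idx = np.arange(len(iris.target))
np.random.shuffle(idx)

rf.fit(iris.data[idx][:100], iris.target[idx][:100])
```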

```python
# Take one held-out instance (kept 2D) and print its class probabilities
instance = iris.data[idx][100:101]
print(rf.predict_proba(instance))
```

```python
prediction, bias, contributions = ti.predict(rf, instance)
print("Prediction", prediction)
print("Bias (trainset prior)", bias)
print("Feature contributions:")
# Each feature contributes one value per class
for c, feature in zip(contributions[0], iris.feature_names):
    print(feature, c)
```
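In the classification case each feature's contribution is a vector with one entry per class, and the bias is the class prior of the training set. A minimal sketch of why, again on a hand-built tree: the node values are class distributions instead of means, so each split shifts probability mass between classes. The tree `CLF_TREE`, its thresholds, and its distributions are hypothetical, and `decompose_proba` is an illustrative helper rather than treeinterpreter's code.

```python
# Toy classification tree with hypothetical class distributions
# (setosa, versicolor, virginica) stored at every node.
CLF_TREE = {
    "value": [0.25, 0.5, 0.25], "feature": "petal length (cm)", "threshold": 2.5,
    "left": {"value": [1.0, 0.0, 0.0]},
    "right": {"value": [0.0, 0.75, 0.25],
              "feature": "petal width (cm)", "threshold": 1.75,
              "left": {"value": [0.0, 1.0, 0.0]},
              "right": {"value": [0.0, 0.0, 1.0]}},
}


def decompose_proba(tree, x):
    """Return (probabilities, bias, contributions) for one sample x given as a dict."""
    bias = tree["value"]  # class prior at the root
    contributions = {}
    node = tree
    while "feature" in node:  # leaves carry no "feature" key
        feature = node["feature"]
        child = node["left"] if x[feature] <= node["threshold"] else node["right"]
        # Per-class change in probability caused by this split
        delta = [c - p for c, p in zip(child["value"], node["value"])]
        previous = contributions.get(feature, [0.0] * len(delta))
        contributions[feature] = [a + d for a, d in zip(previous, delta)]
        node = child
    return node["value"], bias, contributions


sample = {"petal length (cm)": 4.5, "petal width (cm)": 1.3}
proba, bias, contributions = decompose_proba(CLF_TREE, sample)
print("Probabilities:", proba)
print("Bias (prior):", bias)
for feature, c in contributions.items():
    print(feature, c)
```

Since the distributions at a parent and its child both sum to 1, every feature's contribution vector sums to 0 across classes: a feature can only move probability from one class to another.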
