# Estimating the Number of Experiment Repeats for Stochastic Machine Learning Algorithms

## 1. Generate Data

```python
from numpy.random import seed
from numpy.random import normal
from numpy import savetxt
# define underlying distribution of results
mean = 60
stev = 10
# generate samples from ideal distribution
seed(1)
results = normal(mean, stev, 1000)
# save to ASCII file
savetxt('results.csv', results)
```

```
...
6.160564991742511864e+01
5.879850024371251038e+01
6.385602292344325548e+01
6.718290735754342791e+01
7.291188902850875309e+01
5.883555851728335995e+01
3.722702003339634302e+01
5.930375460544870947e+01
6.353870426882840405e+01
5.813044983467250404e+01
```

## 2. Basic Analysis

```python
from pandas import read_csv
from matplotlib import pyplot
# load the results saved in step 1 (one value per row, no header)
results = read_csv('results.csv', header=None)
# descriptive stats
print(results.describe())
# box and whisker plot
results.boxplot()
pyplot.show()
# histogram
results.hist()
pyplot.show()
```

```
count  1000.000000
mean     60.388125
std       9.814950
min      29.462356
25%      53.998396
50%      60.412926
75%      67.039989
max      99.586027
```

## 3. Impact of the Number of Repeats

```python
from pandas import read_csv
from numpy import mean
from matplotlib import pyplot
# load the results saved in step 1 (one value per row, no header)
results = read_csv('results.csv', header=None)
values = results.values
# collect cumulative stats
means = list()
for i in range(1, len(values) + 1):
    # mean of the first i results
    data = values[0:i, 0]
    mean_rmse = mean(data)
    means.append(mean_rmse)
# line plot of cumulative values
pyplot.plot(means)
pyplot.show()
```

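```python
from pandas import read_csv
from numpy import mean
from matplotlib import pyplot
# load the results saved in step 1 (one value per row, no header)
results = read_csv('results.csv', header=None)
values = results.values
# mean over all 1000 results, used as the reference line
final_mean = mean(values)
# collect cumulative stats for the first 500 repeats only
means = list()
for i in range(1, 501):
    data = values[0:i, 0]
    mean_rmse = mean(data)
    means.append(mean_rmse)
# line plot of cumulative values against the final mean
pyplot.plot(means)
pyplot.plot([final_mean for x in range(len(means))])
pyplot.show()
```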

## 4. Calculating the Standard Error

`standard_error = sample_standard_deviation / sqrt(number of repeats)`
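
Rearranging this formula gives a rough way to estimate how many repeats are needed to reach a target standard error: `number of repeats = (sample_standard_deviation / standard_error)^2`. The sketch below is only an illustration and not part of the original code. The function name `repeats_for_standard_error` is hypothetical, and it assumes the standard deviation of the results is already known or has been estimated from a pilot run.

```python
from math import ceil

def repeats_for_standard_error(sample_std, target_stderr):
    """Smallest number of repeats whose standard error is at or below the target.

    Hypothetical helper: rearranges stderr = std / sqrt(n) into n = (std / stderr)^2.
    """
    if sample_std < 0 or target_stderr <= 0:
        raise ValueError('sample_std must be >= 0 and target_stderr must be > 0')
    # always run at least one repeat, even if the spread is zero
    return max(1, ceil((sample_std / target_stderr) ** 2))

# standard deviation of 10, as used to generate the data in section 1
for target in (1.0, 0.5):
    print(target, repeats_for_standard_error(10, target))
```

With a standard deviation of 10, a standard error of 1 needs about 100 repeats and a standard error of 0.5 needs about 400, so halving the error costs roughly four times as many runs.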

```python
from pandas import read_csv
from numpy import std
from matplotlib import pyplot
from math import sqrt
# load the results saved in step 1 (one value per row, no header)
results = read_csv('results.csv', header=None)
values = results.values
# collect cumulative stats
std_errors = list()
for i in range(1, len(values) + 1):
    # standard error of the first i results
    data = values[0:i, 0]
    stderr = std(data) / sqrt(len(data))
    std_errors.append(stderr)
# line plot of cumulative values
pyplot.plot(std_errors)
pyplot.show()
```

```python
from pandas import read_csv
from numpy import std
from matplotlib import pyplot
from math import sqrt
# load the results saved in step 1 (one value per row, no header)
results = read_csv('results.csv', header=None)
values = results.values
# collect cumulative stats
std_errors = list()
for i in range(1, len(values) + 1):
    data = values[0:i, 0]
    stderr = std(data) / sqrt(len(data))
    std_errors.append(stderr)
# line plot of cumulative values
pyplot.plot(std_errors)
# reference lines at standard errors of 0.5 and 1
pyplot.plot([0.5 for x in range(len(std_errors))], color='red')
pyplot.plot([1 for x in range(len(std_errors))], color='red')
pyplot.show()
```
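
The two red lines mark standard errors of 0.5 and 1. The point where the curve first drops below such a threshold can also be located numerically instead of being read off the plot. The following is a rough sketch under stated assumptions: `first_repeat_below` and its `min_repeats` parameter are hypothetical, it uses the standard library's `statistics.stdev` (which divides by n - 1) rather than `numpy.std`, and the synthetic scores only stand in for the contents of `results.csv`.

```python
from math import sqrt
from random import gauss, seed
from statistics import stdev

def first_repeat_below(values, threshold, min_repeats=2):
    """Smallest n >= min_repeats whose cumulative standard error is <= threshold.

    Returns None if the threshold is never reached. Hypothetical helper.
    """
    # stdev needs at least two values, so never start below n = 2
    for n in range(max(2, min_repeats), len(values) + 1):
        stderr = stdev(values[:n]) / sqrt(n)
        if stderr <= threshold:
            return n
    return None

# synthetic stand-in for results.csv: 1000 draws with mean 60 and std 10
seed(1)
scores = [gauss(60, 10) for _ in range(1000)]
for threshold in (1.0, 0.5):
    # skip the first 20 repeats, where the estimate of the spread is noisy
    print(threshold, first_repeat_below(scores, threshold, min_repeats=20))
```

Early estimates of the standard deviation are unstable, so with very few repeats the curve can dip below a threshold by chance and rise again. The `min_repeats` argument is one simple guard against that.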

`sample mean +/- (standard error * 1.96)`
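
As a small worked illustration of this interval, the sketch below computes the mean and the 95% bounds for a handful of scores. It rests on a few assumptions: the helper name `confidence_interval` and the example scores are made up, the 1.96 multiplier assumes the sample mean is approximately normally distributed, and it uses the standard library's `statistics.stdev` (which divides by n - 1), whereas `numpy.std` divides by n by default.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval(scores, z=1.96):
    """Return (mean, lower, upper) for mean +/- z * standard error."""
    if len(scores) < 2:
        raise ValueError('need at least two scores to estimate the spread')
    centre = mean(scores)
    stderr = stdev(scores) / sqrt(len(scores))
    margin = z * stderr
    return centre, centre - margin, centre + margin

# made-up scores purely for illustration
scores = [58.0, 60.0, 62.0]
centre, lower, upper = confidence_interval(scores)
print('mean=%.3f, 95%% interval=[%.3f, %.3f]' % (centre, lower, upper))
```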

```python
from pandas import read_csv
from numpy import std
from numpy import mean
from matplotlib import pyplot
from math import sqrt
# load the results saved in step 1 (one value per row, no header)
results = read_csv('results.csv', header=None)
values = results.values
# collect cumulative stats
means, confidence = list(), list()
n = len(values) + 1
for i in range(20, n):
    data = values[0:i, 0]
    mean_rmse = mean(data)
    stderr = std(data) / sqrt(len(data))
    # half-width of the 95% confidence interval
    conf = stderr * 1.96
    means.append(mean_rmse)
    confidence.append(conf)
# line plot of cumulative values with error bars
pyplot.errorbar(range(20, n), means, yerr=confidence)
# true mean of the underlying distribution
pyplot.plot(range(20, n), [60 for x in range(len(means))], color='red')
pyplot.show()
```

```python
from pandas import read_csv
from numpy import std
from numpy import mean
from matplotlib import pyplot
from math import sqrt
# load the results saved in step 1 (one value per row, no header)
results = read_csv('results.csv', header=None)
values = results.values
# collect cumulative stats for repeats 20 to 200 only
means, confidence = list(), list()
n = 200 + 1
for i in range(20, n):
    data = values[0:i, 0]
    mean_rmse = mean(data)
    stderr = std(data) / sqrt(len(data))
    # half-width of the 95% confidence interval
    conf = stderr * 1.96
    means.append(mean_rmse)
    confidence.append(conf)
# line plot of cumulative values with error bars
pyplot.errorbar(range(20, n), means, yerr=confidence)
# true mean of the underlying distribution
pyplot.plot(range(20, n), [60 for x in range(len(means))], color='red')
pyplot.show()
```
