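I am currently writing a block of code that computes a nonparametric bootstrap over a set of samples for each patient. I have written it so that it is only called when I reject the null of a kurtosis test (considering more intensive exploratory approaches is not feasible at this stage). Below is the structure of the current code block. As an aside, I don't think I can provide my actual data, for practical reasons and because I don't own it, so I will try to be as specific as possible.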
# create the list of subject ids (placeholder values)
SUB_IDS=['values']
# this is a pandas data frame of all subjects' information. Columns are:
# SUB_ID, NG, C, ER
asbs=pd.DataFrame(['Values'])
for x in SUB_IDS:
    x_data=asbs[asbs['SUB_ID']==x]
    x_ng=x_data['NG']
    x_c=x_data['C']
    x_er=x_data['ER']
    a,b = sp.stats.kurtosistest(x_ng)
    c,d = sp.stats.kurtosistest(x_c)
    e,f = sp.stats.kurtosistest(x_er)
    kurtosis_scores.append([x,a,b,c,d,e,f])
    # for simplicity we'll only focus on bootstrapping one feature variable
    if b <=.05:
        mean=x_ng.mean()
    else:
        sampled_means=[]
        for x in range(1,10000):
            g=np.random.choice(x_ng,size=len(x_ng),replace=True)
            print(g)
            g=np.mean(g)
            sampled_means.append(g)
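My code runs smoothly until the last block, where I want to take the mean of the sampled means and append that value to a list together with the subject ID (and do the same for the other bootstrapped statistics I compute afterwards; I left that part out for readability). Every time I call np.mean on sampled_means I get 0 (which makes sense, since Python evaluates it before the iterations run).
What is the best way to "freeze" the array once the for loop has finished updating the values, so that I can pass the statistics on to the array? Thanks!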
Answered on 2018-06-06 09:59:57
This should work:
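import numpy as np
import pandas as pd
import scipy as sp
import scipy.stats  # makes sp.stats available

# create the list of subject ids (placeholder values)
SUB_IDS=['values']
# this is a pandas data frame of all subjects' information. Columns are:
# SUB_ID, NG, C, ER
asbs=pd.DataFrame(['Values'])
kurtosis_scores=[]
for x in SUB_IDS:
    x_data=asbs[asbs['SUB_ID']==x]
    x_ng=x_data['NG']
    x_c=x_data['C']
    x_er=x_data['ER']
    a,b = sp.stats.kurtosistest(x_ng)
    c,d = sp.stats.kurtosistest(x_c)
    e,f = sp.stats.kurtosistest(x_er)
    kurtosis_scores.append([x,a,b,c,d,e,f])
    # for simplicity we'll only focus on bootstrapping one feature variable
    if b <=.05:
        mean=x_ng.mean()
    else:
        sampled_means=[]
        # use a separate loop variable so the subject id in x is not overwritten
        for i in range(1,10000):
            g=np.random.choice(x_ng,size=len(x_ng),replace=True)
            print(g)
            g=np.mean(g)
            sampled_means.append(g)
        # take the mean only after the inner loop has finished;
        # sampled_means is a plain list, so use np.mean rather than .mean()
        mean=np.mean(sampled_means)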
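For anyone who wants to try the idea without the original data, here is a minimal, self-contained sketch. It is not the asker's pipeline: the data frame is synthetic, the helper name `bootstrap_mean`, the `n_resamples` parameter, and the `summary` column names are made up for illustration, and it draws all resamples in one call instead of looping. It also computes the bootstrap mean for every subject and leaves the decision about which branch to use (based on the kurtosis test p-value) to the caller.

```python
import numpy as np
import pandas as pd
from scipy import stats


def bootstrap_mean(values, n_resamples=10000, rng=None):
    """Hypothetical helper: bootstrap the mean of a 1-D sample.

    Returns the mean of the bootstrap means and the array of bootstrap means.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values, dtype=float)
    # One row per resample, each drawn with replacement at the original sample size.
    resamples = rng.choice(values, size=(n_resamples, values.size), replace=True)
    sampled_means = resamples.mean(axis=1)
    # The summary statistic is computed once, after all resamples exist.
    return sampled_means.mean(), sampled_means


# Synthetic stand-in for the real data (assumption: 2 subjects, 50 rows each).
rng = np.random.default_rng(0)
asbs = pd.DataFrame({
    "SUB_ID": np.repeat(["s1", "s2"], 50),
    "NG": rng.normal(loc=5.0, scale=2.0, size=100),
})

results = []
for sub_id, x_data in asbs.groupby("SUB_ID"):
    x_ng = x_data["NG"].to_numpy()
    statistic, p_value = stats.kurtosistest(x_ng)
    boot_mean, sampled_means = bootstrap_mean(x_ng, n_resamples=2000, rng=rng)
    results.append([sub_id, statistic, p_value, x_ng.mean(), boot_mean])

summary = pd.DataFrame(
    results,
    columns=["SUB_ID", "kurtosis_stat", "kurtosis_p", "sample_mean", "boot_mean"],
)
print(summary)
```

Iterating with `groupby` avoids filtering the frame once per subject ID, and collecting one row per subject in `results` keeps the per-subject statistics together until they are turned into a data frame at the end.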
https://stackoverflow.com/questions/50710485