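I am computing Pearson correlations. My result (correlation1) is shown below. Why is the second value in every tuple of correlation1 equal to 0.0? Can anyone explain? Also, my correlation code runs slowly. How can I make it faster?
Result (sample):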
(0.52543523179249552, 0.0), (0.52543905756911169, 0.0), (0.52544196572206603, 0.0), (0.52545010637443945, 0.0)...
from scipy.stats import pearsonr

s1_list = []
s2_list = []
s3_list = []
s4_list = []
zip_list1 = []
zip_list2 = []
correlation1 = []

for x, y in zip(speed1_list, speed2_list):
    zip1 = {"s1": float(x), "s2": float(y)}
    s1_list.append(zip1["s1"])
    s2_list.append(zip1["s2"])
    zip_list1.append(zip1)
    # Recompute the correlation over all pairs collected so far
    correlation1.append(pearsonr(s1_list, s2_list))

print(correlation1)
Input:
speed1_list:[113.0, 116.0, 120.0, 120.0, 117.0, 127.0, 124.0, 118.0, 124.0, 128.0, 128.0, 125.0, 112.0, 122.0, 125.0, 133.0, 128.0, 129.0, 126.0, 123.0, 120.0, 118.0, 114.0, 119.0, 129.0, 127.0, 128.0, 122.0, 120.0, 125.0, 119.0...]
speed2_list:[125.0, 123.0, 120.0, 115.0, 124.0, 120.0, 120.0, 119.0, 119.0, 122.0, 121.0, 116.0, 116.0, 119.0, 116.0, 113.0, 113.0, 115.0, 120.0, 122.0, 122.0, 113.0, 118.0, 121.0, 120.0, 119.0, 116.0...]
correlation1:(0.52543523179249552, 0.0), (0.52543905756911169, 0.0), (0.52544196572206603, 0.0), (0.52545010637443945, 0.0)...
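Posted on 2016-02-18 13:11:22
If you read the documentation of the pearsonr function, you will see that the second item is the p-value. Roughly, it is the probability that uncorrelated data would produce a Pearson correlation at least as extreme as the one computed from your datasets.
If I run your code on the sample lists, I get only one p-value of 0: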
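To illustrate how the p-value reacts to the sample size, here is a minimal sketch. The data is a made-up toy pattern (not the speed lists from the question), and `r_and_p` is a hypothetical helper name. Repeating the pattern leaves the correlation unchanged while the number of samples grows.

```python
from scipy.stats import pearsonr

# Made-up toy pattern (an assumption for illustration, not the question's data).
# Repeating it keeps the means and the correlation the same for any length.
BASE_X = [1.0, 2.0, 3.0, 4.0, 5.0]
BASE_Y = [2.0, 1.0, 4.0, 3.0, 5.0]


def r_and_p(repeats):
    """Return (r, p) from pearsonr for the toy pattern repeated `repeats` times."""
    x = BASE_X * repeats
    y = BASE_Y * repeats
    r, p = pearsonr(x, y)
    return float(r), float(p)


for repeats in (1, 10, 2000):
    r, p = r_and_p(repeats)
    # Same r every time; p shrinks as the sample count grows
    print(5 * repeats, r, p)
```

The coefficient stays at about 0.8 in every case, while the p-value drops as samples are added. For the largest case it is expected to be smaller than a double can represent, so it prints as 0.0.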
correlation1 = [(nan, nan), (-1.0, 0.0), (-0.99946642948624609, 0.020797462218684917), (-0.87259228616792028, 0.12740771383207972), (-0.82714719627765909, 0.083995277603981247), (-0.58025386521762756, 0.22730335863992135), (-0.57868746304695651, 0.17345428063365897), (-0.53247171319158504, 0.17427615080621298), ...
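But I suppose the correlation1 values you showed come from much further along in the list, where you have enough samples for the correlation to be determined very precisely, so the p-value is (or rounds to) 0.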
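On the speed question: the loop calls pearsonr on the whole accumulated list at every step, so the total work grows quadratically with the number of samples. One possible approach, sketched below under assumptions, is to update running means and co-moments (a Welford-style online update) so that each new pair costs constant time. `running_pearson` is a hypothetical helper, not part of scipy, and it returns only the coefficient, not the p-value.

```python
import math


def running_pearson(xs, ys):
    """Return Pearson r for every prefix of (xs, ys) in a single pass.

    Hypothetical helper: keeps running means and co-moments instead of
    recomputing the correlation from scratch for each prefix.
    """
    n = 0
    mean_x = mean_y = 0.0
    m2_x = m2_y = c_xy = 0.0  # running sums of squared / cross deviations
    results = []
    for x, y in zip(xs, ys):
        x, y = float(x), float(y)
        n += 1
        dx = x - mean_x          # deviation from the old mean of x
        dy = y - mean_y          # deviation from the old mean of y
        mean_x += dx / n
        mean_y += dy / n
        m2_x += dx * (x - mean_x)
        m2_y += dy * (y - mean_y)
        c_xy += dx * (y - mean_y)
        if m2_x <= 0.0 or m2_y <= 0.0:
            # Undefined for a single point or a constant series
            results.append(float("nan"))
        else:
            results.append(c_xy / math.sqrt(m2_x * m2_y))
    return results


# Made-up toy data for illustration only
print(running_pearson([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

If only the final correlation is needed, a single pearsonr call after the loop avoids the repeated work entirely. If p-values are needed for selected prefixes, pearsonr can be called for just those prefixes.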
https://stackoverflow.com/questions/35481424