
Calculating Pearson correlation and significance in Python

Stack Overflow user
Asked 2010-10-16 22:15:28
13 answers, 413.8K views, 0 followers, 211 votes

I am looking for a function that takes two lists as input and returns the Pearson correlation along with the significance of that correlation.


13 Answers

Stack Overflow user

Posted 2010-10-16 22:29:57

You can have a look at scipy.stats:

from pydoc import help
from scipy.stats import pearsonr  # public import path; scipy.stats.stats is a deprecated private module
help(pearsonr)

>>>
Help on function pearsonr in module scipy.stats.stats:

pearsonr(x, y)
 Calculates a Pearson correlation coefficient and the p-value for testing
 non-correlation.

 The Pearson correlation coefficient measures the linear relationship
 between two datasets. Strictly speaking, Pearson's correlation requires
 that each dataset be normally distributed. Like other correlation
 coefficients, this one varies between -1 and +1 with 0 implying no
 correlation. Correlations of -1 or +1 imply an exact linear
 relationship. Positive correlations imply that as x increases, so does
 y. Negative correlations imply that as x increases, y decreases.

 The p-value roughly indicates the probability of an uncorrelated system
 producing datasets that have a Pearson correlation at least as extreme
 as the one computed from these datasets. The p-values are not entirely
 reliable but are probably reasonable for datasets larger than 500 or so.

 Parameters
 ----------
 x : 1D array
 y : 1D array the same length as x

 Returns
 -------
 (Pearson's correlation coefficient,
  2-tailed p-value)

 References
 ----------
 http://www.statsoft.com/textbook/glosp.html#Pearson%20Correlation
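
To make the return value concrete, here is a minimal sketch of calling the function on two lists. The sample data in `x` and `y` is made up for illustration and is not from the question.

```python
from scipy.stats import pearsonr

# Hypothetical sample data: y is roughly 2 * x with a little noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# pearsonr returns the correlation coefficient and the two-tailed p-value;
# the result unpacks like a 2-tuple.
r, p_value = pearsonr(x, y)
print(f"r = {r:.4f}, p = {p_value:.4g}")
```

A small p-value suggests the observed correlation would be unlikely if the two lists were uncorrelated, subject to the caveats in the docstring above.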
Votes: 206

Stack Overflow user

Posted 2013-04-16 08:17:58

The Pearson correlation can be calculated with numpy's corrcoef.

import numpy
numpy.corrcoef(list1, list2)[0, 1]
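
The `[0, 1]` index is needed because `corrcoef` returns a correlation matrix, not a single number. A small sketch with made-up values for `list1` and `list2` (both are illustrative assumptions) shows the shape of the result:

```python
import numpy as np

# Hypothetical sample data standing in for list1 and list2.
list1 = [1.0, 2.0, 3.0, 4.0, 5.0]
list2 = [2.1, 3.9, 6.2, 7.8, 10.1]

# For two 1-D inputs, corrcoef returns a symmetric 2x2 matrix:
# the diagonal holds each list's correlation with itself (1.0),
# and the off-diagonal entries hold the correlation between the lists.
matrix = np.corrcoef(list1, list2)
r = matrix[0, 1]
print(matrix)
print(r)
```

Note that `corrcoef` gives only the coefficient; it does not report a p-value.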
Votes: 119

Stack Overflow user

Posted 2011-04-19 16:52:33

If you don't feel like installing scipy, I've used this quick hack, slightly modified from Programming Collective Intelligence:

def pearsonr(x, y):
  # Assume len(x) == len(y)
  n = len(x)
  sum_x = float(sum(x))
  sum_y = float(sum(y))
  sum_x_sq = sum(xi*xi for xi in x)
  sum_y_sq = sum(yi*yi for yi in y)
  psum = sum(xi*yi for xi, yi in zip(x, y))
  num = psum - (sum_x * sum_y/n)
  den = pow((sum_x_sq - pow(sum_x, 2) / n) * (sum_y_sq - pow(sum_y, 2) / n), 0.5)
  if den == 0: return 0
  return num / den
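
The function above returns only the coefficient, not its significance. One way to estimate a p-value without scipy is a permutation test; this is a swapped-in technique, not the test that scipy's pearsonr uses, so the numbers will differ somewhat. The sketch below is self-contained, and the names `pearson_r`, `permutation_p_value`, `n_permutations`, and `seed` are assumptions introduced for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    den = (var_x * var_y) ** 0.5
    return cov / den if den else 0.0


def permutation_p_value(x, y, n_permutations=10000, seed=0):
    """Estimate a two-sided p-value by shuffling y and recomputing r."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: an exact linear relationship.
xs = list(range(10))
ys = [2 * v + 1 for v in xs]
print(pearson_r(xs, ys), permutation_p_value(xs, ys, n_permutations=2000))
```

The estimate cannot go below 1 / (n_permutations + 1), so more permutations are needed to resolve very small p-values.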
Votes: 38
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/3949226
