Calculating the Correlation Coefficient in Python
- Authors

- Name
- Daisuke Kobayashi
The other day I needed to calculate the correlation coefficient between two vectors in Python. It was for a program that searched for matching points by using correlation.
Normally, numpy.corrcoef is enough because it gives you the correlation matrix directly, but when I tried it, I ran into memory errors and could not continue the computation.
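For reference, here is a minimal sketch of what numpy.corrcoef returns for two 1-D vectors. The sample values are made up for illustration and are not from the original program. The result is a 2x2 matrix, and the off-diagonal entry is the coefficient between the two inputs.

```python
import numpy

# Hypothetical sample data, chosen only for illustration.
x = numpy.array([1.0, 2.0, 3.0, 4.0])
y = numpy.array([2.0, 4.0, 6.0, 8.5])

r = numpy.corrcoef(x, y)  # full correlation matrix for the two inputs
print(r.shape)            # (2, 2)
print(r[0, 1])            # correlation between x and y
```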
It looked as if numpy.corrcoef might be leaking memory, so I ended up writing the calculation myself:
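import numpy

def corrcoef(x, y):
    # Center both vectors by subtracting their means.
    mx = x - numpy.mean(x)
    my = y - numpy.mean(y)
    # Dot product of the centered vectors divided by the product of their norms.
    return numpy.dot(mx, my) / numpy.sqrt(numpy.dot(mx, mx) * numpy.dot(my, my))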
For 1-D arrays x and y of equal length, the result is equivalent to numpy.corrcoef(x, y)[0, 1], up to floating-point rounding.
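The equivalence can be checked with a quick sketch like the one below. The random test data and the fixed seed are arbitrary assumptions for the check, and the function is repeated so the snippet runs on its own.

```python
import numpy

def corrcoef(x, y):
    # Same calculation as in the post: center, then normalize the dot product.
    mx = x - numpy.mean(x)
    my = y - numpy.mean(y)
    return numpy.dot(mx, my) / numpy.sqrt(numpy.dot(mx, mx) * numpy.dot(my, my))

# Hypothetical test data: two correlated random vectors (seed is arbitrary).
rng = numpy.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)

ours = corrcoef(x, y)
ref = numpy.corrcoef(x, y)[0, 1]
print(numpy.isclose(ours, ref))  # True
```

Note that if either vector is constant, the denominator is zero and the result is nan, so callers that may see flat inputs should probably check for that case.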