Calculating the Correlation Coefficient in Python

The other day I needed to calculate the correlation coefficient between two vectors in Python. It was for a program that searched for matching points by using correlation.

Normally, numpy.corrcoef is enough because it gives you the correlation matrix directly, but when I tried it, I ran into memory errors and could not continue the computation.

It looked as though numpy.corrcoef might be leaking memory, so I ended up writing the calculation myself.

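import numpy

def corrcoef(x, y):
    # Center both vectors by subtracting their means.
    mx = x - numpy.mean(x)
    my = y - numpy.mean(y)
    # Pearson r: dot product of the centered vectors over the product of their norms.
    return numpy.dot(mx, my) / numpy.sqrt(numpy.dot(mx, mx) * numpy.dot(my, my))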

For two one-dimensional arrays of equal length, the result above is equivalent to numpy.corrcoef(x, y)[0, 1], up to floating-point rounding.
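
A quick way to check that equivalence is to run both versions on the same data. The sketch below repeats the function so it runs on its own. The random test vectors, the seed, and the names ours, reference, and agree are made up for illustration and are not from the original program.

```python
import numpy

def corrcoef(x, y):
    mx = x - numpy.mean(x)
    my = y - numpy.mean(y)
    return numpy.dot(mx, my) / numpy.sqrt(numpy.dot(mx, mx) * numpy.dot(my, my))

# Hypothetical test data: two correlated 1-D vectors of equal length.
rng = numpy.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)

ours = corrcoef(x, y)
reference = numpy.corrcoef(x, y)[0, 1]

# Compare with a tolerance, since the two computations round differently.
agree = numpy.isclose(ours, reference)
print(agree)
```

The comparison uses numpy.isclose rather than ==, because the two code paths perform the floating-point operations in a different order and may differ in the last few bits.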