The Correlation Coefficient for Two Random Variables

Defines the correlation coefficient $\rho(X,Y) = \operatorname{Cov}(X,Y) / (\sigma_X \sigma_Y)$ for two random variables, shows how to compute it directly from the variances and covariance or from raw moments, and shows how to recover the covariance from a given correlation.

Tutorial

Introduction

The correlation coefficient of two random variables $X$ and $Y$, denoted $\rho(X,Y)$ or $\rho_{X,Y}$, is defined as the covariance divided by the product of the standard deviations:

$$\rho(X,Y) = \dfrac{\operatorname{Cov}(X,Y)}{\sigma_X \, \sigma_Y}$$

where $\sigma_X = \sqrt{\operatorname{Var}(X)}$ and $\sigma_Y = \sqrt{\operatorname{Var}(Y)}$.

The covariance tells us the direction of the linear association between $X$ and $Y$, but its magnitude depends on the units of the two variables. Dividing by $\sigma_X \sigma_Y$ removes those units and rescales the result so that it always lies in $[-1, 1]$. Values close to $\pm 1$ indicate a strong linear relationship, and values close to $0$ indicate a weak one.
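The claim that the units drop out can be checked with a short derivation. The constants $a, b, c, d$ (with $a \neq 0$ and $c \neq 0$) are introduced here only for illustration; they stand for a change of units or origin, such as converting centimetres to metres. Since $\operatorname{Cov}(aX+b,\, cY+d) = ac\operatorname{Cov}(X,Y)$, $\sigma_{aX+b} = |a|\,\sigma_X$, and $\sigma_{cY+d} = |c|\,\sigma_Y$:

$$\rho(aX+b,\, cY+d) = \dfrac{ac \operatorname{Cov}(X,Y)}{|a|\,\sigma_X \cdot |c|\,\sigma_Y} = \dfrac{ac}{|ac|}\,\rho(X,Y)$$

So when $a$ and $c$ have the same sign, as in any ordinary change of units, the correlation is unchanged, and when they have opposite signs only its sign flips.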

For example, if $\operatorname{Cov}(X,Y) = 6$, $\operatorname{Var}(X) = 9$, and $\operatorname{Var}(Y) = 16$, then $\sigma_X = 3$ and $\sigma_Y = 4$, so

$$\rho(X,Y) = \dfrac{6}{3 \cdot 4} = \dfrac{1}{2}.$$
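The same arithmetic can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `correlation` and its parameter names are assumptions made for this example, and it takes the covariance and variances as already-known numbers.

```python
import math


def correlation(cov_xy: float, var_x: float, var_y: float) -> float:
    """Return rho(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y).

    Hypothetical helper for illustration: the inputs are the covariance
    and the two variances, not raw data.
    """
    # The ratio is undefined when either standard deviation is zero.
    if var_x <= 0 or var_y <= 0:
        raise ValueError("both variances must be positive")
    sigma_x = math.sqrt(var_x)  # standard deviation of X
    sigma_y = math.sqrt(var_y)  # standard deviation of Y
    return cov_xy / (sigma_x * sigma_y)


# Values from the worked example: Cov = 6, Var(X) = 9, Var(Y) = 16.
rho = correlation(6, 9, 16)
print(rho)  # 0.5
```

Taking variances rather than standard deviations as inputs mirrors the way the example states its givens, so the square roots are computed in one place.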