(Statistics) Calculating Correlation Coefficient

  • 04-03-2011 09:36AM
    #1
    Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭


    I'm a first year student doing Statistics as part of my degree at the moment.

    We were told to calculate the correlation coefficient (r) as follows:
    [latex]r=\displaystyle\frac{SS_{xy}}{\sqrt{SS_{xx}}\sqrt{SS_{yy}}}[/latex]

    Which is equivalent to
    [latex]r=\displaystyle\frac{\sum(x-\bar{x})(y-\bar{y})}{\sqrt{\sum(x-\bar{x})^2\sum(y-\bar{y})^2}}[/latex]

    I understand it perfectly up until here, but we were told that it's easier to compute by hand if we write it in the following way.

    [latex]r=\displaystyle\frac{\sum{xy}-\frac{\sum{x}\sum{y}}{n}}{\sqrt{(\sum{x^2}-\frac{(\sum{x})^2}{n})(\sum{y^2}-\frac{(\sum{y})^2}{n})}}[/latex]

    And while this is considerably easier to compute by hand, I don't understand the equivalence.

    How does [latex]\sum(x-\bar{x})(y-\bar{y})[/latex] equate to [latex]\sum{xy}-\frac{\sum{x}\sum{y}}{n}[/latex], for example?

    Is it just an estimate (seeing as n, the number of pairs of data, is included), or is it an exact equivalence?

    Thanks


Comments

  • Registered Users, Registered Users 2 Posts: 2,481 ✭✭✭Fremen


    It's exact. You just need to go back to the definition of [latex]\bar{x}[/latex]

    [latex]\sum(X-\bar{X})(Y -\bar{Y}) = \sum(XY) - n\bar{X}\bar{Y} [/latex]

    (why is this true?)

    then,
    [latex]\bar{X}\bar{Y} = \left(\frac{\sum X}{n}\right) \left(\frac{\sum Y}{n}\right) [/latex],

    so

    [latex]n \bar{X}\bar{Y} = \frac{\sum X \sum Y}{n} [/latex].
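
    For the step marked "why is this true?", one way to see it (a sketch, using only the definitions [latex]\sum X = n\bar{X}[/latex] and [latex]\sum Y = n\bar{Y}[/latex]) is to expand the product and note that [latex]\bar{X}[/latex] and [latex]\bar{Y}[/latex] are constants that can be pulled out of the sums:

    [latex]\sum(X-\bar{X})(Y-\bar{Y}) = \sum XY - \bar{Y}\sum X - \bar{X}\sum Y + n\bar{X}\bar{Y} = \sum XY - n\bar{X}\bar{Y} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y} = \sum XY - n\bar{X}\bar{Y}[/latex]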

    In the language of expectations,

    [latex] \mathbb{E}[(X -\mathbb{E}[X])(Y - \mathbb{E}[Y])] = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y][/latex].

    This is true for any random variables X and Y for which these expectations exist.

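    Once you understand how the numerator works, set Y = X in the same identity: it gives [latex]\sum(X-\bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n}[/latex], and likewise for Y, so the denominator follows as well.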
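
    A quick numerical check can make the equivalence more convincing. The sketch below is only an illustration: the function names and the five data pairs are made up for the example, not taken from the course. It computes r from the deviation form and from the by-hand form, and the two should agree up to floating-point rounding.

```python
import math


def r_deviation_form(xs, ys):
    """r from sums of products of deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


def r_computational_form(xs, ys):
    """r from raw sums only, the form that is easier to do by hand."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt(
        (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
    )
    return numerator / denominator


# Made-up example data (an assumption for illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

print(r_deviation_form(xs, ys))
print(r_computational_form(xs, ys))
```

    For this data both forms reduce to 6 / sqrt(10 * 6), since SS_xy = 6, SS_xx = 10 and SS_yy = 6 whichever way they are computed.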


  • Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭**Timbuk2**


    Thank you, that makes complete sense :)

