(Statistics) Calculating Correlation Coefficient

  • 04-03-2011 9:36am
    #1
    Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭


    I'm a first year student doing Statistics as part of my degree at the moment.

    We were told that the correlation coefficient (r) is calculated as follows:
    [latex]r=\displaystyle\frac{SS_{xy}}{\sqrt{SS_{xx}}\sqrt{SS_{yy}}}[/latex]

    Which is equivalent to
    [latex]r=\displaystyle\frac{\sum(x-\bar{x})(y-\bar{y})}{\sqrt{\sum(x-\bar{x})^2\sum(y-\bar{y})^2}}[/latex]

    I understand it perfectly up until here, but we were told that it's easier to compute by hand if we write it in the following way.

    [latex]r=\displaystyle\frac{\sum{xy}-\frac{\sum{x}\sum{y}}{n}}{\sqrt{(\sum{x^2}-\frac{(\sum{x})^2}{n})(\sum{y^2}-\frac{(\sum{y})^2}{n})}}[/latex]

    And while this is considerably easier to compute by hand, I don't understand the equivalence.

    How does [latex]\sum(x-\bar{x})(y-\bar{y})[/latex] equate to [latex]\sum{xy}-\frac{\sum{x}\sum{y}}{n}[/latex], for example?

    Is it just an estimate (seeing as there is an n, the number of pairs of data, included), or is it an exact equivalence?

    Thanks


Comments

  • Registered Users, Registered Users 2 Posts: 2,481 ✭✭✭Fremen


    It's exact. You just need to go back to the definition of [latex]\bar{x}[/latex], namely [latex]\bar{x}=\frac{\sum x}{n}[/latex].

    [latex]\sum(X-\bar{X})(Y -\bar{Y}) = \sum(XY) - n\bar{X}\bar{Y} [/latex]

    (why is this true?)

    then,
    [latex]\bar{X}\bar{Y} = \left(\frac{\sum X}{n}\right) \left(\frac{\sum Y}{n}\right) [/latex],

    so

    [latex]n \bar{X}\bar{Y} = \frac{\sum X \sum Y}{n} [/latex].
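
    For reference, the first identity above comes from expanding the product and using [latex]\sum X = n\bar{X}[/latex] and [latex]\sum Y = n\bar{Y}[/latex] (the sample means are constants, so they come outside the sums):

    [latex]\sum(X-\bar{X})(Y-\bar{Y}) = \sum XY - \bar{Y}\sum X - \bar{X}\sum Y + n\bar{X}\bar{Y}[/latex]

    [latex]= \sum XY - n\bar{X}\bar{Y} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y} = \sum XY - n\bar{X}\bar{Y}[/latex]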

    In the language of expectations,

    [latex] \mathbb{E}[(X -\mathbb{E}[X])(Y - \mathbb{E}[Y])] = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y][/latex].

    This is true for any random variables X and Y for which these expectations exist.

    Once you understand how the numerator works, set Y=X to get [latex]\sum(X-\bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n}[/latex] (and likewise for Y), and the denominator follows.
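
    If it helps to see the equivalence numerically, here is a minimal Python sketch that computes r both ways. The function names and the sample data are made up for illustration; they are not from the course notes.

```python
import math

def r_definition(xs, ys):
    """r from deviations about the means: SS_xy / sqrt(SS_xx * SS_yy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

def r_shortcut(xs, ys):
    """r from raw sums only (the 'easier by hand' form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Illustrative data only (hypothetical, n = 5 pairs)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

print(r_definition(xs, ys))
print(r_shortcut(xs, ys))
```

    For this data both forms give SS_xy = 6, SS_xx = 10 and SS_yy = 6, so r = 6/sqrt(60) either way. On a computer the shortcut form can lose precision when the values are large relative to their spread, which is one reason software tends to use the deviation form; by hand that is not a concern.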


  • Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭**Timbuk2**


    Thank you, that makes complete sense :)

