
Question on plotting data and line of best fit

  • 11-01-2010 6:26pm
    #1
    Registered Users, Posts: 394


    Hi all,

    Am feeling fairly slow today so if anyone can help...

    I am plotting two variables against each other on a standard scatterplot and then getting the line of best fit (and its equation) so that I can predict one variable from the other.

    But I seem to be getting different answers depending on which variable I put on which axis. The R value is still the same and obviously the equation is different, but when I sub one number into both equations I get different answers...

    Is that correct, and should I make a proper decision on which variable definitely goes on the x-axis?

    Thanks!


Comments

  • gerry87 (Registered Users, Posts: 872)


    boarddotie wrote: »
    Hi all,

    Am feeling fairly slow today so if anyone can help...

    I am plotting two variables against each other on a standard scatterplot and then getting the line of best fit (and its equation) so that I can predict one variable from the other.

    But I seem to be getting different answers depending on which variable I put on which axis. The R value is still the same and obviously the equation is different, but when I sub one number into both equations I get different answers...

    Is that correct, and should I make a proper decision on which variable definitely goes on the x-axis?

    Thanks!

    The beta (the slope of the line) is Cov(dependent, independent) / Var(independent).

    So by changing the axes you're switching the dependent and the independent variables, which changes the variance in the denominator of the beta.

    You can switch between the two slopes by multiplying by the ratio of the variances: Beta(y on x) * [ Var(x) / Var(y) ] gives Beta(x on y).
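
    As a rough check of the formula above, here is a minimal Python sketch. The data values are made up purely for illustration (they are not from this thread), and the variable names are my own.

```python
import numpy as np

# Hypothetical illustrative data, not taken from the thread.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Sample covariance and variances (ddof=1 throughout so they are consistent).
cov_xy = np.cov(x, y, ddof=1)[0, 1]
var_x = np.var(x, ddof=1)
var_y = np.var(y, ddof=1)

# Slope when y is the dependent variable (y regressed on x).
beta_y_on_x = cov_xy / var_x

# Slope when x is the dependent variable (x regressed on y).
beta_x_on_y = cov_xy / var_y

# Converting one slope into the other using the ratio of the variances.
converted = beta_y_on_x * var_x / var_y

print(beta_y_on_x, beta_x_on_y, converted)
```

    The covariance is the same whichever way round you go; only the variance in the denominator changes, which is why the ratio Var(x)/Var(y) converts one slope into the other.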

    It's describing the same relationship but from a different perspective. Take a regression with car price and miles: one way round says the price changes by, say, €120 per mile; the other way round says how many miles correspond to a €1 change in price.

    The Y should be the dependent variable, so Y should depend on X. In the car/miles example, the miles clearly don't depend on the price; the price is set based on the miles. So the miles should be the X and the price the Y.

    Edit: fixed mistake delphi mentioned


  • Delphi91 (Registered Users, Posts: 1,501)


    gerry87 wrote: »
    ...The X should be the dependent variable, so Y should depend on X...

    Incorrect - to plot data correctly, the variable put on the x-axis is always the independent variable (in my classes, I always refer to it as "the one you have control over or vary"). The dependent variable is then plotted on the y-axis.


  • MathsManiac (Registered Users, Posts: 1,595)


    You might still be wondering why you get different answers depending on which variable is considered the independent one. It might help your understanding of what's going on to think of it visually as follows:

    The line of best fit is the line that minimises the "sum of the squared residuals". The residual corresponding to a data point is the difference between the actual y-value and the y-value predicted by the line. Looking at a scatterplot, it's the vertical distance from the point to the line. So the best-fit line is the one that minimises the sum of squares of these distances.
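
    Written out, the quantity being minimised looks like this. The symbols a (intercept), b (slope) and n (number of points) are my own notation, not something used elsewhere in the thread; the resulting slope is the usual covariance-over-variance expression.

```latex
% Sum of squared vertical residuals for the line y = a + b x
S(a, b) = \sum_{i=1}^{n} \bigl( y_i - (a + b\,x_i) \bigr)^2

% The values of a and b that minimise S(a, b)
b = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)},
\qquad
a = \bar{y} - b\,\bar{x}
```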

    If you switch variables, it's like switching from the vertical distances to the horizontal distances. The line that minimises the sum of squares of the y-residuals is generally not the same line as the one that minimises the sum of squares of the "x-residuals" (the two coincide only when the points lie exactly on a straight line). This is why switching your independent and dependent variables is not just like switching x and y in the equation, and why it can change the estimates of one variable from the other.
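
    To see this with numbers, here is a small Python sketch that fits the line both ways round and then predicts y for the same x from each. The data are invented for illustration, and using np.polyfit for the fit is my choice, not something the original poster said they used.

```python
import numpy as np

# Hypothetical illustrative data, not taken from the thread.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Fit y = a_yx + b_yx * x (minimises the vertical distances).
b_yx, a_yx = np.polyfit(x, y, 1)

# Fit x = a_xy + b_xy * y (minimises the horizontal distances).
b_xy, a_xy = np.polyfit(y, x, 1)

# Predict y at the same x value from each fitted line.
x0 = 5.0
y_from_yx = a_yx + b_yx * x0        # use the y-on-x line directly
y_from_xy = (x0 - a_xy) / b_xy      # rearrange the x-on-y line to solve for y

# The correlation coefficient is the same whichever way round you fit.
r = np.corrcoef(x, y)[0, 1]

print(y_from_yx, y_from_xy)
print(b_yx * b_xy, r ** 2)
```

    With this made-up data the two predictions at x = 5 come out at about 5.2 and 6.0, even though R is identical for both fits. The product of the two slopes equals R squared, so rearranging one line only reproduces the other when R is exactly 1 or -1.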


  • Ostrom (Registered Users, Posts: 3,483)


    It always helped me to think in terms of response and explanatory variables: the response (your dependent variable) goes on the y-axis, and the explanatory variable (your independent variable) goes on the x-axis. This is useful when interpreting r-squared for different controls and asking "how much of the variation in y does this particular variable explain?"
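
    A short Python sketch of that "variation explained" reading of r-squared, again with invented data. The helper name r_squared is hypothetical, made up for this example.

```python
import numpy as np

def r_squared(explanatory, response):
    """Fraction of the variation in `response` explained by a straight-line fit."""
    slope, intercept = np.polyfit(explanatory, response, 1)
    residuals = response - (intercept + slope * explanatory)
    ss_res = np.sum(residuals ** 2)                       # unexplained variation
    ss_tot = np.sum((response - response.mean()) ** 2)    # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical illustrative data, not taken from the thread.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

r2_y_on_x = r_squared(x, y)   # x explanatory, y response
r2_x_on_y = r_squared(y, x)   # roles swapped

print(r2_y_on_x, r2_x_on_y)
```

    For a straight-line fit with one explanatory variable, r-squared comes out the same whichever variable is treated as the response, which matches the original poster's observation that R did not change when the axes were swapped. The fitted line and its predictions still depend on that choice.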

