
Linear Regression

  • 19-01-2010 2:44pm
    #1
    Closed Accounts Posts: 62 ✭✭


    Hello all,

    With respect to linear regression, could anybody explain what is meant by the "sampling distribution of an estimator" and, in particular, how it can be used to compare estimators?
    Is this just the range of possible values an estimate could take, together with their associated probabilities? And with respect to comparison, does the sampling distribution determine whether or not an estimator is a "good" one to use in a regression model?


    Also, again with respect to linear regression, I was wondering if anybody could give me a brief description (their functions, how they're determined, etc.) of the following concepts:
    1. R^2 (R squared)
    2. Predicted R^2 (Predicted R squared)
    3. Adjusted R^2 (Adjusted R squared)
    I'm pretty sure that R^2 is just the coefficient of determination, and that - roughly speaking - it shows how much of the variation in "y" is attributable to/explained by "x". Is this correct?
    If so, what are the predicted and adjusted R^2 concepts?

    Thanks in advance! ;)


Comments

  • Registered Users, Registered Users 2 Posts: 3,483 ✭✭✭Ostrom


    The sampling distribution of an estimator is identical for all conditional estimates of y at any value of x on the regression line. It is used to calculate the standard deviation of the conditional distribution of y for fitted x values, which is the root of the mean square error. It is reported alongside r squared to describe your line and its goodness of fit. The r squared is the square of the correlation coefficient, and is read as the percentage of variance in y (in terms of x) that your model explains.
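
    To make the idea concrete, here is a minimal numpy sketch (the toy model y = 1 + 2x + noise, the seed, and all the numbers are illustrative assumptions, not from the thread). Refitting the line on repeated samples traces out the sampling distribution of the slope estimator, and a single fit gives the root mean square error and r squared described above:

        import numpy as np

        rng = np.random.default_rng(42)  # illustrative seed
        n = 50
        true_intercept, true_slope, sigma = 1.0, 2.0, 1.5  # assumed toy model
        x = np.linspace(0, 10, n)

        # Refit the line on 5000 fresh samples; the spread of the fitted
        # slopes is the sampling distribution of the slope estimator.
        slopes = []
        for _ in range(5000):
            y = true_intercept + true_slope * x + rng.normal(0, sigma, n)
            slope, intercept = np.polyfit(x, y, 1)
            slopes.append(slope)
        print("mean of fitted slopes:", np.mean(slopes))  # near 2.0 (unbiased)
        print("sd of fitted slopes:", np.std(slopes))     # the estimator's standard error

        # For a single fit: root mean square error and r squared
        y = true_intercept + true_slope * x + rng.normal(0, sigma, n)
        slope, intercept = np.polyfit(x, y, 1)
        residuals = y - (intercept + slope * x)
        rmse = np.sqrt(np.mean(residuals ** 2))
        r2 = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
        print("RMSE:", rmse, "r squared:", r2)

    The general way the sampling distribution is used to compare estimators: an estimator whose sampling distribution is centred on the true value (unbiased) and tightly concentrated (low variance) is preferred over one that is biased or more spread out.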


  • Registered Users, Registered Users 2 Posts: 3,483 ✭✭✭Ostrom


    On the adjusted r squared: it is usually reported to allow an extra bit of room for small samples. If you google the formula (sorry, I can't add anything, I'm on my phone) and throw both into an Excel sheet, you can see the adjusted r squared approaching the r squared as the sample size increases.
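
    For reference, the usual formula is adjusted r squared = 1 - (1 - r²)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. Here is a small numpy sketch of the comparison just described (the toy data and seed are illustrative assumptions); running it shows the adjusted value approaching the plain r squared as n grows:

        import numpy as np

        def adjusted_r2(r2, n, p):
            # Standard adjustment: penalise r squared by sample size n
            # and number of predictors p
            return 1 - (1 - r2) * (n - 1) / (n - p - 1)

        rng = np.random.default_rng(0)  # illustrative seed
        for n in (10, 30, 100, 1000, 10000):
            x = rng.uniform(0, 10, n)
            y = 1 + 2 * x + rng.normal(0, 2, n)  # assumed toy model
            slope, intercept = np.polyfit(x, y, 1)
            residuals = y - (intercept + slope * x)
            r2 = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
            print(f"n={n:>6}  r2={r2:.4f}  adjusted r2={adjusted_r2(r2, n, 1):.4f}")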


  • Closed Accounts Posts: 62 ✭✭patriks


    Thanks a million for that, efla. It really is appreciated.

    All the best! ;)


  • Registered Users, Registered Users 2 Posts: 3,483 ✭✭✭Ostrom


    Also, on the comparison of estimators: the standardised forms of linear equations express the slope as the standard deviation increase in y per standard deviation increase in x. Since unstandardised regression coefficients depend on their original units, you need this to compare across models with different predictors. The identical conditional distributions of the estimators for any x values of a given model are what allow you to make these comparisons.
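
    A short numpy sketch of that standardisation (the toy data, units, and seed are illustrative assumptions): multiplying the raw slope by sd(x)/sd(y) gives the same number as refitting on z-scores, i.e. the standard deviation change in y per standard deviation change in x:

        import numpy as np

        rng = np.random.default_rng(1)  # illustrative seed
        n = 200
        x = rng.normal(50, 10, n)               # predictor in arbitrary units
        y = 3 + 0.8 * x + rng.normal(0, 5, n)   # assumed toy model

        slope, intercept = np.polyfit(x, y, 1)  # unstandardised, unit-dependent
        beta = slope * np.std(x, ddof=1) / np.std(y, ddof=1)  # standardised slope

        # Equivalent route: regress the z-scores of y on the z-scores of x
        zx = (x - x.mean()) / np.std(x, ddof=1)
        zy = (y - y.mean()) / np.std(y, ddof=1)
        beta_z, _ = np.polyfit(zx, zy, 1)

        print(beta, beta_z)  # same value either way

    In simple regression this standardised slope is just the correlation coefficient, which ties back to the first reply: r squared is its square.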

