
Linear Regression

  • 19-01-2010 2:44pm
    #1
    Closed Accounts Posts: 62 ✭✭


    Hello all,

    With respect to linear regression, could anybody explain what is meant by the "sampling distribution of an estimator" and, in particular, how it can be used to compare estimators?
    Is this just the range of possible values an estimate could take, together with their associated probabilities? And with respect to comparison, does the sampling distribution determine whether or not an estimator is a "good" one to use in a regression model?


    Also, again with respect to linear regression, I was wondering if anybody could give me a brief description (their functions, how they're determined, etc.) of the following concepts:
    1. R^2 (R squared)
    2. Predicted R^2 (Predicted R squared)
    3. Adjusted R^2 (Adjusted R squared)
    I'm pretty sure that R^2 is just the coefficient of determination, and that - roughly speaking - it shows how much of the variation in "y" is attributable to/explained by "x". Is this correct?
    If so, what are the predicted and adjusted R^2 concepts?

    Thanks in advance! ;)


Comments

  • Registered Users, Registered Users 2 Posts: 3,483 ✭✭✭Ostrom


    The sampling distribution of an estimator is identical for all conditional estimates of y at any value of x on the regression line. It is used to calculate the standard deviation of the conditional distribution of y for fitted x values, which is the root of the mean square error. It is reported alongside r squared to describe your line and its goodness of fit. The r squared is the square of the correlation coefficient, and is read as the percentage of variance in y (in terms of x) that your model explains.
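
    To make the idea concrete, here is a minimal numpy sketch (the toy model y = 1 + 2x + noise, the seed, and all the numbers are illustrative assumptions, not from the thread). Refitting the line on repeated samples traces out the sampling distribution of the slope estimator, and a single fit gives the root mean square error and r squared described above:

        import numpy as np

        rng = np.random.default_rng(42)  # illustrative seed
        n = 50
        true_intercept, true_slope, sigma = 1.0, 2.0, 1.5  # assumed toy model
        x = np.linspace(0, 10, n)

        # Refit the line on 5000 fresh samples; the spread of the fitted
        # slopes is the sampling distribution of the slope estimator.
        slopes = []
        for _ in range(5000):
            y = true_intercept + true_slope * x + rng.normal(0, sigma, n)
            slope, intercept = np.polyfit(x, y, 1)
            slopes.append(slope)
        print("mean of fitted slopes:", np.mean(slopes))  # near 2.0 (unbiased)
        print("sd of fitted slopes:", np.std(slopes))     # the estimator's standard error

        # For a single fit: root mean square error and r squared
        y = true_intercept + true_slope * x + rng.normal(0, sigma, n)
        slope, intercept = np.polyfit(x, y, 1)
        residuals = y - (intercept + slope * x)
        rmse = np.sqrt(np.mean(residuals ** 2))
        r2 = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
        print("RMSE:", rmse, "r squared:", r2)

    The general way the sampling distribution is used to compare estimators: an estimator whose sampling distribution is centred on the true value (unbiased) and tightly concentrated (low variance) is preferred over one that is biased or more spread out.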


  • Registered Users, Registered Users 2 Posts: 3,483 ✭✭✭Ostrom


    On the adjusted r squared: it is usually reported to allow an extra bit of room for small samples. If you google the formula (sorry, I can't add anything, I'm on my phone) and throw both into an Excel sheet, you can see the adjusted r squared approaching the r squared as the sample size increases.
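
    For reference, the usual formula is adjusted r squared = 1 - (1 - r²)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. Here is a small numpy sketch of the comparison just described (the toy data and seed are illustrative assumptions); running it shows the adjusted value approaching the plain r squared as n grows:

        import numpy as np

        def adjusted_r2(r2, n, p):
            # Standard adjustment: penalise r squared by sample size n
            # and number of predictors p
            return 1 - (1 - r2) * (n - 1) / (n - p - 1)

        rng = np.random.default_rng(0)  # illustrative seed
        for n in (10, 30, 100, 1000, 10000):
            x = rng.uniform(0, 10, n)
            y = 1 + 2 * x + rng.normal(0, 2, n)  # assumed toy model
            slope, intercept = np.polyfit(x, y, 1)
            residuals = y - (intercept + slope * x)
            r2 = 1 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
            print(f"n={n:>6}  r2={r2:.4f}  adjusted r2={adjusted_r2(r2, n, 1):.4f}")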


  • Closed Accounts Posts: 62 ✭✭patriks


    Thanks a million for that, efla. It really is appreciated.

    All the best! ;)


  • Registered Users, Registered Users 2 Posts: 3,483 ✭✭✭Ostrom


    Also, on the comparison of estimators: the standardised forms of linear equations express the slope as the standard deviation increase in y per standard deviation increase in x. Since unstandardised regression coefficients depend on their original units, you need this to compare across models with different predictors. The identical conditional distributions of the estimators for any x values of a given model are what allow you to make these comparisons.
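
    A short numpy sketch of that standardisation (the toy data, units, and seed are illustrative assumptions): multiplying the raw slope by sd(x)/sd(y) gives the same number as refitting on z-scores, i.e. the standard deviation change in y per standard deviation change in x:

        import numpy as np

        rng = np.random.default_rng(1)  # illustrative seed
        n = 200
        x = rng.normal(50, 10, n)               # predictor in arbitrary units
        y = 3 + 0.8 * x + rng.normal(0, 5, n)   # assumed toy model

        slope, intercept = np.polyfit(x, y, 1)  # unstandardised, unit-dependent
        beta = slope * np.std(x, ddof=1) / np.std(y, ddof=1)  # standardised slope

        # Equivalent route: regress the z-scores of y on the z-scores of x
        zx = (x - x.mean()) / np.std(x, ddof=1)
        zy = (y - y.mean()) / np.std(y, ddof=1)
        beta_z, _ = np.polyfit(zx, zy, 1)

        print(beta, beta_z)  # same value either way

    In simple regression this standardised slope is just the correlation coefficient, which ties back to the first reply: r squared is its square.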

