Statistics - Regression

bedrock#1 · 15-12-2010 10:13PM #1

Hi all,

I'm studying for a stats exam on Friday and I'm having real trouble getting my head around interpreting regression output from R.

I'm trying to do some sample questions from a past paper but getting flipping nowhere... could somebody gimme a dig out? Would really appreciate some help, cheers folks.

R

Économiste Monétaire · 15-12-2010 11:39PM

What exactly are you having difficulty with? Your model is:

[latex]\displaystyle Price_{i} = \beta_{0} + \beta_{1} SQFT_{i} + \beta_{2}FEATS_{i} + \beta_{3}NE_{i} + \beta_{4}CUST_{i} + \epsilon_{i} [/latex]

"Residual Standard Error" is an estimate of the standard deviation of the error term, [latex]\displaystyle \epsilon [/latex]. When you run the above regression you get the estimated residuals 'e'. The residual standard error is then calculated by

[latex]\displaystyle \sqrt{\frac{\sum_{i=1}^{N} e_{i}^{2}}{df}}[/latex]
where df is the degrees of freedom, N - k - 1, with N the number of observations and k the number of explanatory variables (here k = 4).
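
As a rough sketch of the calculation above, assuming you already have the residuals as a list (the numbers below are made up for illustration and are not from the exam output; the function name is hypothetical):

```python
import math

def residual_standard_error(residuals, k):
    """Return sqrt(sum of squared residuals / (N - k - 1)).

    k is the number of explanatory variables, not counting the intercept.
    """
    n = len(residuals)
    df = n - k - 1  # degrees of freedom
    if df <= 0:
        raise ValueError("need more observations than estimated parameters")
    ssr = sum(e * e for e in residuals)  # sum of squared residuals
    return math.sqrt(ssr / df)

# Made-up residuals, purely illustrative
example_residuals = [1.0, -2.0, 0.5, 1.5, -1.0, 0.0, 2.0, -2.0]
rse = residual_standard_error(example_residuals, k=4)
print(round(rse, 4))
```

With 8 residuals and k = 4 there are only 3 degrees of freedom, which is why the divisor matters so much in small samples.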

The R-squared number is the variance of the predicted values for price divided by the variance of price itself. It measures the share of the variation in price that is explained by your model. The F-stat tests the null hypothesis that all the slope betas (every beta except the intercept) are jointly equal to zero.
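
To see the R-squared definition in action, here is a minimal sketch using a one-variable least-squares fit on made-up data (the data and helper names are assumptions for illustration only). It checks that the variance ratio agrees with the more familiar 1 - SSR/SST form, which holds when the model includes an intercept:

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    """Sample variance (divides by n - 1)."""
    m = mean(xs)
    return sum((v - m) ** 2 for v in xs) / (len(xs) - 1)

def fit_simple_ols(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

# Made-up data, not from the exam question
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

b0, b1 = fit_simple_ols(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# R-squared as variance of fitted values over variance of y
r2_variance_ratio = variance(fitted) / variance(y)

# R-squared as 1 - SSR / SST
ssr = sum(e * e for e in residuals)
sst = sum((yi - mean(y)) ** 2 for yi in y)
r2_from_residuals = 1 - ssr / sst

print(r2_variance_ratio, r2_from_residuals)
```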

The 'Estimate' column gives the model's estimates of the respective betas above. The t-values come from dividing each estimate by its standard error.
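
A small sketch of that division, using hypothetical estimates and standard errors (not the values from the exam paper). The cut-off of 2 is only a rough rule of thumb for 5% significance when the degrees of freedom are reasonably large; your notes or t-tables give the exact critical value:

```python
def t_value(estimate, std_error):
    """t-value = coefficient estimate divided by its standard error."""
    return estimate / std_error

# Hypothetical (estimate, standard error) pairs, for illustration only
rows = {
    "SQFT": (0.6, 0.2),
    "FEATS": (1.2, 1.0),
}

t_values = {name: t_value(est, se) for name, (est, se) in rows.items()}

# Rough rule of thumb: |t| above about 2 suggests significance at the 5% level
roughly_significant = {name: abs(t) > 2.0 for name, t in t_values.items()}

for name in rows:
    print(name, round(t_values[name], 2), roughly_significant[name])
```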

bedrock#1 · 16-12-2010 12:04AM

Hey,

Thanks for the quick reply, it's really just parts i, ii and iv I'm having trouble with.

I've attached the solution I think is correct for predicting the price of the second-hand home in question ii.

I'm not doing a stats or maths degree, I'm doing a social science degree and they reckon we need an introduction to stats. Multiple linear regression is as far as the course goes, but with the crappy weather last week we didn't have our final tutorial, so I didn't get a chance to go through this part of the course.

Really appreciate the help !

R

gerry87 · 16-12-2010 12:20AM

The regression shows the relationship between the independent variables and the dependent variable. The relationship is the beta.

So for a regression equation

Y = a + B1*X + B2*Z

The B1 shows the relationship between X and Y, holding Z constant. You can say that for a 1 unit change in X, with Z unchanged, you get a 1*B1 unit change in Y. Likewise, for a 5 unit change in Z, with X unchanged, you get a 5*B2 unit change in Y. (Part ii is just plugging numbers into the equation.)
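
Here is a quick sketch of the "plugging in" idea with hypothetical coefficients (a, B1 and B2 below are invented, not the estimates from the paper):

```python
def predict(a, b1, b2, x, z):
    """Predicted Y from the equation Y = a + B1*X + B2*Z."""
    return a + b1 * x + b2 * z

# Hypothetical coefficients, for illustration only
a, b1, b2 = 10.0, 2.0, -3.0

base = predict(a, b1, b2, x=4.0, z=1.0)         # 10 + 8 - 3 = 15
x_plus_one = predict(a, b1, b2, x=5.0, z=1.0)   # X up by 1, Z unchanged
z_plus_five = predict(a, b1, b2, x=4.0, z=6.0)  # Z up by 5, X unchanged

print(x_plus_one - base)   # equals 1 * B1
print(z_plus_five - base)  # equals 5 * B2
```

The negative B2 here shows the "one goes up, the other goes down" case: raising Z by 5 lowers the prediction by 15.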

The "Estimate" figures on your paper are the betas.

To interpret results you want to look at the betas. A negative beta means a negative relationship: one goes up, the other goes down. Then ask whether the figure is high or low, keeping in mind the units of what you're looking at. For example, 1 extra feature might have the same effect on price as a 1000 unit increase in square footage, but that's not comparing like for like.

Are they useful? Check your p-values and t-values; you'll have notes on that for checking at 5% significance or 10% etc.

Économiste Monétaire · 16-12-2010 12:20AM

You left out NE and CUST from your calculations, so add the estimates for these to your previous answer. For part one, I'd explain what each coefficient means; for example, what does the coefficient on SQFT mean? A one-unit increase in square feet of living space, all else equal, should increase price by what amount? Remember that price is in hundreds of dollars. Do this for all of the variables.

For part four, I would comment on the fact that the R-squared value is high but the individual coefficients on NE and FEATS are not significant at the 5% level. What does this imply for the functional form of the model? Should they be dropped? How would you test for this? Should Price actually be ln(Price)?
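
On the "how would you test for this" question, one standard option (an assumption here, since the thread does not say which test the course expects) is a partial F-test: fit the full model and a restricted model without NE and FEATS, then compare their sums of squared residuals. A sketch with made-up numbers:

```python
def partial_f_statistic(ssr_restricted, ssr_full, q, n, k):
    """F-statistic for dropping q variables from a model with k regressors.

    ssr_restricted: sum of squared residuals without the q variables
    ssr_full:       sum of squared residuals of the full model
    n:              number of observations
    k:              number of explanatory variables in the full model
    """
    df_full = n - k - 1
    numerator = (ssr_restricted - ssr_full) / q
    denominator = ssr_full / df_full
    return numerator / denominator

# Made-up sums of squares and sample size, purely for illustration
f_stat = partial_f_statistic(ssr_restricted=120.0, ssr_full=100.0, q=2, n=105, k=4)
print(f_stat)  # 10.0
```

You would compare the result with the critical value of an F distribution with q and n - k - 1 degrees of freedom; a large value suggests the dropped variables do matter jointly.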
