Help with statistical analysis

Hi guys,
I have to statistically analyze results from a quantitative survey. I have a numerical head but higher LC maths is as far as I went so any help would be greatly appreciated.
The majority of the questions are Likert-scale. I've read that I can't treat them as interval so does this mean I must include standard deviation as well as the mean? Or something else altogether..?
Also, if anyone could help out with the concept of a t-test, I would be very grateful. Augh! I would have done qualitative research if I'd known about this!
Taco
Comments

The majority of the questions are Likert-scale. I've read that I can't treat them as interval so does this mean I must include standard deviation as well as the mean? Or something else altogether..?
Strictly speaking, Likert scales are ordinal, not interval. That said, many people treat them as interval anyway. There is no globally accepted way to statistically analyze this kind of data. Expressing the data as means + SD is commonly done. This is wrong, but if you did that, you'd only be as wrong as everyone else. Simple graphs are always good too: plot either the means or the distribution of answers, with the Likert scale on the x-axis.
Also, if anyone could help out with the concept of a t-test, I would be very grateful. Augh! I would have done qualitative research if I'd known about this!
A t-test is a relatively simple calculation that determines the probability of finding the observed difference in 2 means, if the null hypothesis (i.e. that there is no difference, only sampling error) is true. This probability can be used to decide whether or not to reject the null hypothesis at a chosen level of confidence. To calculate this, it uses the means as a measure of central tendency and the SD as a measure of variability.
There are 2 types of t-test. One is for unpaired samples. For example, you would use this if you had two different groups of people answering the same questionnaire (men and women) and were trying to see if their answers were different. A paired t-test is for when the same people answer the same questionnaire item twice and you are trying to see if something has changed. You might also use this to compare the same people's scores on 2 items, although regression/correlation would be a much more informative way to analyze that kind of data.
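To make the two types concrete, here is a rough sketch of how each t statistic is put together from the means and SDs. The scores are made up for illustration, the unpaired version assumes equal variances in the two groups, and the function names are my own rather than anything from Excel or SPSS. It only gives the t value and degrees of freedom; the probability still has to come from a t table or your software.

```python
from math import sqrt
from statistics import mean, stdev, variance


def unpaired_t(a, b):
    """t statistic for two independent groups, assuming equal variances."""
    na, nb = len(a), len(b)
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, na + nb - 2


def paired_t(first, second):
    """t statistic for the same people measured twice."""
    diffs = [y - x for x, y in zip(first, second)]
    n = len(diffs)
    # The test is run on the per-person differences, not on the raw scores.
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical 1-5 Likert answers to one item.
men = [4, 3, 5, 4, 2, 4]
women = [3, 2, 3, 4, 2, 3]
t_unpaired, df_unpaired = unpaired_t(men, women)

# Hypothetical answers from the same five people on two occasions.
first_time = [2, 3, 3, 4, 2]
second_time = [3, 4, 3, 5, 4]
t_paired, df_paired = paired_t(first_time, second_time)

print(f"unpaired: t = {t_unpaired:.3f}, df = {df_unpaired}")
print(f"paired:   t = {t_paired:.3f}, df = {df_paired}")
```

The bigger the difference in means relative to the spread, the bigger the t value, which is all the test is really measuring.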
Unfortunately, the t-test assumes that the data are normally distributed, which is unlikely to be the case with Likert-scale output. Luckily, there are nonparametric equivalents of the t-test that are just as easy to use and don't make that assumption (Mann-Whitney = unpaired; Wilcoxon signed rank = paired). These are probably the most correct way to proceed, but can usually only be done in proper statistical packages, which you may not have access to.
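For anyone curious what the unpaired nonparametric test does under the hood, this is a minimal sketch of the Mann-Whitney U statistic with a normal approximation for the p-value. The data are invented, the function name is mine, and the approximation (with a tie correction but no continuity correction) is rough for samples this small, so treat the p-value as indicative only. The paired Wilcoxon test is not shown.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(a, b):
    """U statistic for group a versus group b, with an approximate p-value."""
    na, nb = len(a), len(b)
    n = na + nb
    # U counts the pairs in which a's answer beats b's; ties count as half.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    # Tied answers (very common with Likert data) shrink the variance of U.
    tie_counts = Counter(list(a) + list(b)).values()
    tie_term = sum(t ** 3 - t for t in tie_counts) / (n * (n - 1))
    sigma = sqrt(na * nb / 12 * ((n + 1) - tie_term))
    z = (u - na * nb / 2) / sigma  # fails if every answer is identical
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, z, p_two_sided


# Hypothetical 1-5 Likert answers to one item.
men = [4, 3, 5, 4, 2, 4]
women = [3, 2, 3, 4, 2, 3]
u, z, p = mann_whitney_u(men, women)
print(f"U = {u}, z = {z:.3f}, approximate p = {p:.3f}")
```

Because it only asks which answer in each pair is higher, the test never relies on the distance between "agree" and "strongly agree" being meaningful, which is why it suits ordinal data.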
It might be easier to simply do whatever your supervisor says, even if it's wrong... :pac:
Mods: politely request that this be moved to the researcher forum?

What does it say about my tutor when a person on Boards is more helpful?

Again, thanks so much for this reply. It's a great help. I'll take your advice and go with the standard t-test.
Don't forget Excel can do simple correlation/regression too, if you want to compare answers to different questionnaire items by the same people.
What does it say about my tutor when a person on Boards is more helpful?
:pac:

Thanks 2Scoups. I've decided to go ahead and just do the t-test as I'm running quite short of time and, well, that's what my tutor says to do. Just 2 questions:
What happens when my t Stat is negative?
and
Can I use the t-test to compare more than 2 groups? E.g., I compared male/female but I've also grouped them into 5 different income categories, so how does that work?
Also, what is the correlation/regression you mentioned above? Sorry but I'm pretty much starting from scratch here..
thanks in advance

I'm an undergrad but I thought I'd throw an oar in (I'm sure I'll be corrected).
A negative t-test value just tells you the direction of the difference between the groups; you only have to interpret the results. Try putting in the variables the opposite way around. I'm sure you'll then get a positive result.
Each t-test carries a 5% error, so multiple t-tests will lead to an analysis with very high levels of error! AFAIK an ANOVA is used as it doesn't carry the same kind of cumulative error. I doubt Excel can do ANOVAs?
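A quick back-of-the-envelope sketch of how that error builds up, using the 5 income categories from the question. It assumes the tests are independent, which pairwise comparisons on the same data are not, so the figure is a rough guide rather than an exact rate. The variable names are just for illustration.

```python
from math import comb

alpha = 0.05                    # error rate accepted for a single t-test
groups = 5                      # the five income categories
comparisons = comb(groups, 2)   # every pair of groups needs its own t-test

# Chance of at least one false positive across all the tests,
# under the simplifying assumption that the tests are independent.
familywise_error = 1 - (1 - alpha) ** comparisons

# Bonferroni: divide the threshold by the number of tests.
bonferroni_alpha = alpha / comparisons

print(f"{comparisons} pairwise t-tests")
print(f"chance of at least one false positive: {familywise_error:.3f}")
print(f"Bonferroni threshold per test: {bonferroni_alpha:.4f}")
```

So with five groups the chance of at least one spurious "significant" result is closer to 40% than 5%, and the Bonferroni fix means each individual test has to clear a much stricter threshold.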
About the regression in Excel, I don't know as I've only used SPSS, but isn't the analysis provided by Excel either graphical or very basic?
Why would you not use something like SPSS if you're doing various analyses on data? IMO, if you've used it before, it would be much handier to apply what you learned then!
I hope some of this helped
Edit: "so you're right that I don't have access to those statistical packages". Analysis could be tricky and considerably slower

Thanks Dango! I reversed the column order and got a positive value! It was identical but positive so from now on, if I get a negative value, I'll just make it a positive one.
I'm afraid the whole thing is due in 2 weeks (gulp) and my tutor advised against using SPSS so I'm stuck with Excel. Thanks so much for your reply. Would you know if it's possible to carry out a t-test on more than two groups?

You don't have to reverse the order, just interpret the results, but if it works and makes sense then you should be fine.
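Here is a small sketch of why the sign doesn't matter much. The numbers are made up and the function name is mine; it uses the equal-variance form of the unpaired t statistic.

```python
from math import sqrt
from statistics import mean, variance


def t_stat(a, b):
    """Unpaired t statistic (equal variances assumed): mean(a) minus mean(b)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


# Hypothetical Likert answers from two groups.
group_a = [2, 3, 3, 2, 4]
group_b = [4, 3, 5, 4, 4]

t_ab = t_stat(group_a, group_b)  # negative: group_a has the lower mean
t_ba = t_stat(group_b, group_a)  # same size, opposite sign

print(f"a vs b: {t_ab:.3f}")
print(f"b vs a: {t_ba:.3f}")
```

The sign only says which group was entered first and which had the higher mean. For a two-sided test the significance depends on the size of t, not its sign.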
From wiki, as it says it better than I could:
"One-way ANOVA is used to test for differences among two or more independent groups. Typically, however, the one-way ANOVA is used to test for differences among at least three groups, since the two-group case can be covered by a t-test (Gossett, 1908)."
However, if you wiki (how academic) under t-test and multivariate:
"A generalization of Student's t statistic, called Hotelling's T-square statistic, allows for the testing of hypotheses on multiple (often correlated) measures within the same sample. For instance, a researcher might submit a number of subjects to a personality test consisting of multiple personality scales (e.g. the Big Five). Because measures of this type are usually highly correlated, it is not advisable to conduct separate univariate t-tests to test hypotheses, as these would neglect the covariance among measures and inflate the chance of falsely rejecting at least one hypothesis (Type I error). In this case a single multivariate test is preferable for hypothesis testing. Hotelling's T² statistic follows a T² distribution. However, in practice the distribution is rarely used, and instead converted to an F distribution."
Now i haven't done anything like that but i think it pertains to what you want to do?
Sorry i couldn't be of more help! I'm a floundering student too!

Ah, so I use an ANOVA... Excellent
Er...please don't take offense but I am flailing around in the shallow end of the kiddie's pool of statistics and so am not even going to go near that Hotelling's T² stuff..
*blub*...*gargle*... :pac:
I need to get to the BGRH bar to buy you all virtual pints in thanks.

What happens when my t Stat is a negative?
Looks like Dango dealt with this. It's not that important for what you need, which is where the t-statistic falls on the t-distribution, i.e. the significance of the difference.
Can I use the ttest to compare more than 2 groups? eg, I compared male/female but I've grouped them into 5 different income categories..so how does that work?
The best way to do this is to use a 2 (m/f) x 5 (income category) ANOVA. Excel can't do this but you could probably do the calculations yourself with a calculator and some time... but it's asking a bit much with just 2 weeks to go. Also, if the ANOVA is significant, it eventually boils down to t-tests again, anyway. You could try cutting out the middle man and just doing the multiple t-tests. Your supervisor has shown, IMO, a casual disregard for statistics so I doubt s/he'd have a problem with this approach. :pac: The problem would be that multiple t-tests inflate type I error (the chance of seeing a significant test, even when there is no real difference) to completely untenable levels. Correcting for this (with Bonferroni or some other multiple-comparison correction) is possible, but will limit your power if you have a small sample size.
Also, what is the correlation/regression you mentioned above?
This is a test that will show the relation between answers to different questions. For example, you might find out that people who answer positively to Q1 tend to answer positively to Q4; or people who answer positively to Q3 tend to answer negatively to Q7. If you have many items that are in a similar theme, it might be useful to present the fact that they elicit similar responses from people.
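As a rough illustration, here is the correlation coefficient worked out by hand for some made-up answers. The question labels and scores are hypothetical, and the function name is mine. This is the ordinary Pearson coefficient, which as far as I know is what Excel's CORREL gives you; for strictly ordinal data a rank-based version (Spearman) would be the more cautious choice.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical answers from the same six people to three items.
q1 = [1, 2, 3, 4, 5, 4]
q4 = [2, 2, 3, 5, 5, 4]  # tends to move with q1
q7 = [5, 4, 4, 2, 1, 2]  # tends to move against q1

r_q1_q4 = pearson_r(q1, q4)
r_q1_q7 = pearson_r(q1, q7)

print(f"Q1 vs Q4: r = {r_q1_q4:.3f}")
print(f"Q1 vs Q7: r = {r_q1_q7:.3f}")
```

A value near +1 means people who score high on one item score high on the other, a value near -1 means the opposite, and a value near 0 means no linear relationship.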

Your supervisor has a shown, IMO, a casual disregard for statistics so I doubt s/he'd have a problem this approach. :pac:
But before that, I literally could only apply gender and would have been left with two results!! The guy is an A+ muppet.
This is a test that will show the relation between answers to different questions. For example, you might find out that people who answer positively to Q1 tend to answer positively to Q4; or people who answer positively to Q3 tend to answer negatively to Q7. If you have many items that are in a similar theme, it might be useful to present the fact that they elicit similar responses from people.
Thanks guys, you are all being an incredible help to me.

As it stands, I have started doing ANOVA tests, which are working out well, so thanks again to you Dango.
Well, remember that a significant ANOVA will still need to be broken down to find out where the difference is, so you'll end up using multiple comparisons anyway. Incidentally, are you analyzing for gender and income with separate ANOVAs in Excel?
Ok that would be extremely handy. Using Excel, do I simply use the CORREL function?
Not sure about the CORREL function. I would use the Data Analysis menu again and use regression.

Well, remember that a significant ANOVA will still need to be broken down to find out where the difference is, so you'll end up using multiple comparisons anyway. Incidentally, are you analyzing for gender and income with separate ANOVAs in Excel?
I am analyzing the variables gender, income and education on two data sets with separate ANOVAs in Excel, yes..is that incorrect?
Not sure about the CORREL function. I would use the Data Analysis menu again and use regression.

Yes, well I'm looking at the ANOVA and then putting the n, mean and standard deviation into a table so that the descriptive and inferential data combined gives me the result...is that correct?
The key statistic from the ANOVA will be the F-value and its significance. If it is significant, then you will have your answer for gender, since there are only two groups. However, if the 'income' ANOVA is significant, you won't know what is driving the difference until you break it down by each of the 5 subgroups.
I am analyzing the variables gender, income and education on two data sets with separate ANOVAs in Excel, yes..is that incorrect?
It's probably the best you can do given your situation, but you won't have the ability to detect interactions. For example, income might be a really predictive variable in women only, but the men in your groups could completely drown out the difference when you analyze them together, statistically speaking. Or maybe men and women are completely different in the lowest income group but the same everywhere else, in which case you could potentially lose that information too.
Not incorrect... more of a limitation to what conclusions you can draw.
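In case it helps to see where the F-value comes from, here is a bare-bones sketch of a one-way ANOVA worked out by hand. The five income bands and their scores are invented and the function name is mine. It stops at the F-value and its degrees of freedom; the significance would come from an F table or from the ANOVA output in your software.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    k, n = len(groups), len(all_scores)
    # Between-groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: how far each score sits from its own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical Likert answers for five income bands, four people each.
income_bands = [
    [2, 3, 2, 3],
    [3, 3, 2, 4],
    [3, 4, 3, 4],
    [4, 4, 3, 5],
    [4, 5, 5, 4],
]
f_value, df_between, df_within = one_way_anova(income_bands)
print(f"F({df_between}, {df_within}) = {f_value:.3f}")
```

A large F means the group means differ by more than the scatter within groups would explain, but it doesn't say which bands differ. With only two groups, F is just the square of the equal-variance t statistic, which is why the gender ANOVA and the t-test agree.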

Yes, I see what you mean. To be honest, I put a lot of effort into the literature review, was really interested in the subject matter etc and am very disappointed to have been tripped up by the analysis. I was hoping to be able to analyse very thoroughly and do a good job. Oh well..
On the ANOVA: yes, I'm looking for the significance with the F-value and p-value and then looking to the mean to find out where the significance is.
Whew, what a steep learning curve. I didn't even know what a p-value was 2 weeks ago..

If you have the time, you could download the evaluation copy of SPSS from their website and work on your stats before the license expires. It's not that hard to use and we could help you with your analysis...
http://www.spss.com/downloads/Papers.cfm?ProductID=00035&Name=SPSS_Base&DLType=Demo


Hi guys,
Just to let you know I finished the analysis and am powering ahead in the discussion chapter. Thanks to you both for your help! I really would have been stuck without it. I'm wondering what the protocol is for putting Boards users in my acknowledgements page :pac:
Taco
