Probability of Occurrences

ShyWho12 · 03-12-2020 11:46AM #1

Hi!

I am wondering if someone could help me with an issue I am having, as I am new to probabilities. I have data that I calculated the correlations for and plotted the distribution as a histogram, getting a bell-shaped curve as expected. Now I want to know the likelihood of those correlations occurring. I have calculated these values, accounting for values above and below the mean.

I need to make a visual to represent these values, but I am unsure what the best graph is. I have tried a histogram, but because probabilities are positive this doesn’t seem to work, as I do expect that bell-shaped curve again. When I look at my data, the probabilities gradually increase to a point and then start to decrease. But when I try to plot the histogram, it’s as if, once the peak is reached, the probabilities fold back on each other because they are all positive values, and it’s not a fair representation of the data.

Thank you!

Michael Collins · 11-12-2020 01:33PM

Hello ShyWho12,

It's not 100% clear what you mean.

You say you have a histogram already. Is this not what you want? It tells you the frequency of occurrence (the count) for each range on your x-axis. To get an estimate of the probability, you could then divide each y-value on your histogram plot by the total number of occurrences. This is one way of generating an experimental distribution, although there are other, more sophisticated, ways.
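To illustrate the division step described above, here is a minimal sketch in plain Python. The data are a hypothetical stand-in (random bell-shaped numbers, not the poster's correlations), and the function name `empirical_probabilities` and the choice of 11 equal-width bins are assumptions made for the example only.

```python
import random

def empirical_probabilities(values, n_bins=10):
    """Bin values into equal-width ranges and return (edges, counts, probs)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; cannot form bins")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so the largest value falls in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    # Divide each count by the total number of occurrences.
    probs = [c / total for c in counts]
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts, probs

# Hypothetical stand-in data: bell-shaped values centred on 0.
rng = random.Random(0)
sample = [rng.gauss(0.0, 0.2) for _ in range(1000)]
edges, counts, probs = empirical_probabilities(sample, n_bins=11)

# Text plot: x-range on the left, estimated probability on the right.
for left, right, p in zip(edges, edges[1:], probs):
    print(f"[{left:+.2f}, {right:+.2f})  p = {p:.3f}  " + "#" * round(p * 100))
```

Every estimated probability is positive, yet the bars still rise to a peak and fall away again, because each probability is plotted against its own x-range rather than against the other probabilities.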

Yes, probabilities are always between 0 and 1 (inclusive), but that doesn't mean the histogram cannot go back down again. Remember, you're breaking your x-data up into discrete ranges and seeing how many of the values fall into each range. This count usually reaches a peak for mid-value ranges, e.g. most people have a height between 5 ft 5 in and 5 ft 10 in, while fewer people are shorter or taller than this, so the graph will be lower for height ranges above and below this range.

Normally, there's no need to consider the mean separately. If your data are bell-shaped (or close to it), then the mean lies in the range at the centre of the x-axis, where your histogram reaches its peak, i.e. you can see this visually from the plot. If your data are not bell-shaped (i.e. not "normally distributed"), then the mean is not so easy to see visually, but it can be calculated and indicated on the graph.
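As a rough sketch of calculating the mean and indicating it on the graph, the plain-Python snippet below bins some hypothetical bell-shaped data (not the poster's), computes the mean, and marks the range that contains it. The bin count of 9 and all variable names are assumptions for illustration.

```python
import random
import statistics

# Hypothetical stand-in data: bell-shaped values centred near 0.1.
rng = random.Random(1)
values = [rng.gauss(0.1, 0.25) for _ in range(2000)]

# Equal-width bins across the observed range.
n_bins = 9
lo, hi = min(values), max(values)
width = (hi - lo) / n_bins
counts = [0] * n_bins
for v in values:
    counts[min(int((v - lo) / width), n_bins - 1)] += 1

# Locate the range containing the mean and the range with the highest count.
mean = statistics.fmean(values)
mean_bin = min(int((mean - lo) / width), n_bins - 1)
peak_bin = counts.index(max(counts))

# Text plot with the mean's range flagged.
for i, c in enumerate(counts):
    left = lo + i * width
    marker = "  <- mean" if i == mean_bin else ""
    print(f"[{left:+.2f}, {left + width:+.2f})  {'#' * (c // 20)}{marker}")
```

For bell-shaped data like this, the flagged range should sit at or right next to the tallest bar. For skewed data the two can drift apart, which is when marking the mean explicitly becomes useful.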

It would help if you could upload an image of the various plots you mention.
