I'm trying to get fully comfortable with sample vs. population for both variance & standard deviation.
I'm pulling data from a database that lists our nightly processes (around 200-300) and their execution times in minutes, going back over the last 30 days. Some of the processes have a fairly consistent running time (i.e. charting them gives a relatively level line), while others vary wildly (plotting them looks like a sine curve).
I want to get the VARIANCE & STD_DEV to track which jobs are not consistent, so I can find out why and take action where necessary.
When pulling the data from the database, I am able to run a standard deviation function, but there are two separate functions: STDDEV_SAMP and STDDEV_POP.
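To make the difference between the two concrete, here is a minimal Python sketch, assuming the usual definitions (the population form divides by n, the sample form by n - 1). The run times are made-up values for illustration, not my real data, and the variable names are hypothetical.

```python
import math
import statistics

# Hypothetical run times in minutes for one job (made-up values)
run_times = [62.0, 75.0, 68.0, 80.0, 61.0, 71.0]
n = len(run_times)

# Population form: squared deviations divided by n (as in STDDEV_POP)
pop_sd = statistics.pstdev(run_times)

# Sample form: squared deviations divided by n - 1 (as in STDDEV_SAMP)
samp_sd = statistics.stdev(run_times)

# The two differ only by the constant factor sqrt(n / (n - 1))
expected_ratio = math.sqrt(n / (n - 1))
actual_ratio = samp_sd / pop_sd

print(f"population sd: {pop_sd:.3f}")
print(f"sample sd:     {samp_sd:.3f}")
print(f"ratio:         {actual_ratio:.4f}")
```

With 30 data points per job that factor is sqrt(30/29), so the two results should differ by only about 1.7%, and by the same factor for every job that has the same number of runs.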
Which should I use?
The confusing thing for me is that the last 30 days of data is all I am interested in for the purpose of generating a "current" average. To me that makes it the population, whereas in fact it is only a sample.
To add further, some jobs have running times between 1-2 minutes and would result in a small VARIANCE & STD_DEV, while others run between 60-80 minutes. So I need to level these out by expressing the VARIANCE as a % of the AVERAGE, right?
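The kind of normalization I have in mind could look like the following Python sketch. It divides the standard deviation (rather than the variance, which is in squared minutes) by the mean, a ratio usually called the coefficient of variation. The function name `relative_spread`, the two made-up jobs, and the choice of the sample formula are all assumptions for illustration only.

```python
import statistics

def relative_spread(run_times):
    """Return the standard deviation as a percentage of the mean.

    Uses the sample standard deviation (n - 1 divisor); swapping in
    statistics.pstdev would give the population version instead.
    """
    mean = statistics.mean(run_times)
    sd = statistics.stdev(run_times)
    return sd / mean * 100.0

# Hypothetical jobs (made-up values): one short, one long
short_job = [1.0, 2.0, 1.0, 2.0, 1.5, 1.5]        # runs for 1-2 minutes
long_job = [60.0, 80.0, 60.0, 80.0, 70.0, 70.0]   # runs for 60-80 minutes

for name, times in [("short_job", short_job), ("long_job", long_job)]:
    print(name, round(statistics.stdev(times), 3), round(relative_spread(times), 1))
```

In this made-up data the short job has the smaller standard deviation in absolute minutes but the larger spread relative to its average, which is the effect I am trying to capture.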