Statistics Help

aurthurg · 10-03-2008 06:02PM #1

Ok this is going to be pretty hard to say in words but I'll have a go.

Basically

I have a set of data points in an xyz co-ord system and I am using interpolation techniques to predict an unknown z at a given xy.

What I think is that the distribution of the xy data will affect the accuracy of the z calculated, but I have no idea how to quantify that statistically.

To try and explain it better: say the unknown z at the known xy is at the centre of a square. In one square (a) we have 10 xyz points all located in the top left corner; in square (b) the 10 points are evenly distributed around the square. I think the z from square (b) will be more accurate than the z from (a), but is there a way of quantifying the distribution?

Probably best to note I hope to use Excel to do this.

10-03-2008 07:10PM

Not an expert but a normality test would do this, right?

Jarque-Bera / Kruskal-Wallis test for normality, or perhaps a Kolmogorov-Smirnov test run ex post? The K-S test will identify any differences between two sample groups.

(there are probably a million and one reasons why this is not viable..)

aurthurg · 12-03-2008 02:41PM

Any other thoughts on this?

spline · 12-03-2008 11:01PM

So we're looking at measurements of z at a series of locations whose coordinates are given by x,y... You want the f that gives z=f(x,y), and you'd also like the variance of the estimate.

With a set of irregularly located zs and the desire to estimate some other zs at say, the mesh points of a regular grid, most would reach for the interpolation option in their GIS software. Inverse distance weighting gives a quick, cheap, and dirty estimate of the locally weighted mean. If you want a variance estimate, then a kriged estimate is probably what you need, but then you're into the murky world of semivariograms, sills, nugget variance and so on.

But... if you're using Excel, then it's probably IDW as the easiest solution. And given that you can compute the distance weighted mean of the observed zs, you should be able to generate the distance weighted variance without too much trouble, using the local mean. Kriging would be an interesting challenge in Excel.
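To make the IDW suggestion concrete, here is a minimal sketch in Python of the distance weighted mean and the distance weighted variance about that local mean. The function name `idw_estimate`, the exponent `power=2` and the sample borehole values are all assumptions for illustration; none of them come from the thread. Note that the weighted variance describes the local spread of the observed zs and is not a formal prediction error.

```python
import math

def idw_estimate(points, x0, y0, power=2.0):
    """Return (weighted mean, weighted variance) of z at (x0, y0).

    points is a list of (x, y, z) tuples. power=2 is an assumed,
    commonly used exponent, not a value given in the thread.
    """
    weights = []
    for x, y, z in points:
        d = math.hypot(x - x0, y - y0)
        if d == 0.0:
            # The target coincides with an observation: return it directly.
            return z, 0.0
        weights.append(1.0 / d ** power)

    total = sum(weights)
    zs = [z for _, _, z in points]
    mean = sum(w * z for w, z in zip(weights, zs)) / total
    # Distance weighted variance of the observed zs about the local mean.
    var = sum(w * (z - mean) ** 2 for w, z in zip(weights, zs)) / total
    return mean, var

# Hypothetical data: (x, y, z) at the four corners of a unit square.
boreholes = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0),
             (0.0, 1.0, 14.0), (1.0, 1.0, 16.0)]
z_hat, z_var = idw_estimate(boreholes, 0.5, 0.5)
print(z_hat, z_var)
```

In Excel the same two sums could be built from a column of weights using SUMPRODUCT, with one column for 1/d^2 and one for the squared deviations from the weighted mean.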

Let us know what you do.

aurthurg · 13-03-2008 11:23PM

GIS is exactly what we're using; the basis of the project is to create 3D soil maps for the new metro route. We are using kriging methods to interpolate the points, but I have no idea about the different variables you mention, we're simply using the default ones.

What we're investigating (and at this stage we're stuck for time) is the effect that the density and distribution of the boreholes have when we are trying to predict an unknown borehole at a known location. The density issue we have sorted: we are entering the data iteratively and seeing the effect on accuracy. But distribution is troubling us.

spline · 18-03-2008 11:16AM

One of the advantages of using kriging for interpolation is that you can obtain the standard errors for your interpolated values. If you're using Geostatistical Analyst in ArcGIS it's fairly easy (some hints and tips at http://www.ce.utexas.edu/prof/maidment/giswr2005/geostat/GeostatisticExercise.htm).

I usually use Edzer Pebesma's gstat package (available as a standalone application or a package for R - details at http://www.gstat.org/ and http://cran.r-project.org/web/packages/gstat/index.html). It's easy to specify the models in gstat, particularly if there's any spatial drift.
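As a rough illustration of why the kriging standard error speaks to the distribution question: for a given semivariogram, the ordinary kriging variance depends only on where the samples sit relative to the target, not on the z values. The sketch below compares the two squares from the first post. It is not gstat or ArcGIS code; the exponential semivariogram and its parameters (sill 1, range parameter 0.5, no nugget), the point layouts, and all function names are assumptions chosen for illustration.

```python
import math

def gamma(h, sill=1.0, rng=0.5, nugget=0.0):
    """Exponential semivariogram; all three parameters are assumed values."""
    if h == 0.0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-h / rng))

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def kriging_variance(xy, target):
    """Ordinary kriging variance at target for sample locations xy."""
    n = len(xy)
    # Semivariances between samples, bordered by the unbiasedness
    # constraint that the kriging weights sum to 1.
    a = [[gamma(dist(xy[i], xy[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    g0 = [gamma(dist(p, target)) for p in xy]
    sol = solve(a, g0 + [1.0])
    weights, mu = sol[:n], sol[n]
    return sum(w * g for w, g in zip(weights, g0)) + mu

target = (0.5, 0.5)  # centre of the unit square
# Square (a): 10 points packed into the top left corner.
clustered = [(0.05 * (i % 5), 1.0 - 0.05 * (i // 5)) for i in range(10)]
# Square (b): 10 points spread over the square on a 5 x 2 grid.
spread = [(0.1 + 0.2 * (i % 5), 0.25 + 0.5 * (i // 5)) for i in range(10)]

var_a = kriging_variance(clustered, target)
var_b = kriging_variance(spread, target)
print(var_a, var_b)
```

Under these assumptions the spread layout gives the smaller kriging variance at the centre, which matches the intuition in the original question. The actual numbers depend entirely on the fitted semivariogram, so with real borehole data the model would need to come from the data rather than from defaults.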
