
Statistics problem

  • 15-11-2012 9:38pm
    #1
    Registered Users, Registered Users 2 Posts: 3


    I can't understand how to solve this problem and would appreciate it if someone could show me how to approach it.

    1. The bombing of London during World War II was studied by statisticians as a Poisson random variable. One of the goals was to determine whether the Germans were bombing randomly or could target specific areas. London was divided into a grid consisting of 576 squares, each of area 0.25 square kilometers, and the number of bombs that landed in each grid square was counted. The total number of bombs that fell was 538. The statisticians found that the number of grid squares on which exactly two bombs fell was 93. What is the expected number of grid squares on which exactly two bombs landed if the bombs were dropped at random over the grid?

    Thanks.


Comments

  • Registered Users, Registered Users 2 Posts: 5,141 ✭✭✭Yakuza


    This is what I'd do (but I'm a little rusty with the Poisson Dist so not 100% sure)
    You have 538 bombs and 576 grid squares, so your estimate for the "arrival rate" (λ) is 538/576, or 0.934 to 3 d.p. (think of it as an average of 0.934 bombs per grid square).

    What is the probability of getting 2 bombs in one square if the number of bombs per square follows a Poisson distribution with rate 0.934? [The probability of n bombs in one square is (λ^n * exp(-λ))/n!.]

    Multiply this by the total number of squares to get the expected number of squares with exactly 2 bombs.

    Again, let me repeat that I'm a bit rusty on these problems but that is intuitively what I'd do here.
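
    The steps above can be sketched in a few lines of Python. This is only an illustration of the calculation described in this post; the names `poisson_pmf`, `lam`, `p_two` and `expected_two` are my own choices and do not come from the question.

```python
import math

bombs = 538      # total bombs, from the question
squares = 576    # grid squares, from the question

# Estimated Poisson rate: average number of bombs per grid square
lam = bombs / squares


def poisson_pmf(n, lam):
    """Probability of exactly n events when the count is Poisson with rate lam."""
    return lam ** n * math.exp(-lam) / math.factorial(n)


# Probability that a given square receives exactly two bombs
p_two = poisson_pmf(2, lam)

# Expected number of squares with exactly two bombs
expected_two = squares * p_two

print(f"lambda = {lam:.3f}")
print(f"P(exactly 2 bombs) = {p_two:.4f}")
print(f"expected squares with 2 bombs = {expected_two:.1f}")
```

    With the figures in the question this comes out at roughly 98.7 squares, compared with the 93 actually observed.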


  • Registered Users, Registered Users 2 Posts: 1,163 ✭✭✭hivizman


    I would have done exactly the same calculation as Yakuza.

    This problem was discussed in a paper in the Journal of the Institute of Actuaries in 1946, and there's a summary of the article available if you click here.


  • Registered Users, Registered Users 2 Posts: 5,141 ✭✭✭Yakuza


    hivizman wrote: »
    I would have done exactly the same calculation as Yakuza.

    This problem was discussed in a paper in the Journal of the Institute of Actuaries in 1946, and there's a summary of the article available if you click here.

    Thanks for the link, interesting reading. That article reports 535 bombs instead of the 538 in the OP, so there's a small difference between my lambda parameter and the one in the article.

    As a recent returnee to the actuarial exams, it's a pleasant surprise to have retained something in the auld noggin after a gap of almost 20 years :)


  • Registered Users, Registered Users 2 Posts: 3 johnners


    Thanks for the help I get it now :)


  • Registered Users, Registered Users 2 Posts: 1 carlos799


    hi guys i've come across this exact problem and need help with some of the follow up questions.

    5. Consider a bomb which landed within the London grid square we have
    sampled. What is the probability that the bomb landed at least 0.2km
    from the edge of the grid square (assume that the bombs were, in fact,
    dropped entirely at random)?

    6. What is the expected distance that the bomb in the previous question
    landed from the edge of the grid square?

    how do you calculate the expected distance? I have no clue where to start :o

    thank you.


  • Registered Users, Registered Users 2 Posts: 5,141 ✭✭✭Yakuza


    If the grid squares have sides of 0.5km, then for a bomb to have landed at least 0.2km from the edge of a square, it must have landed somewhere in a square of side 0.3km (with the same centre point as the outer square). Can you see a relationship between the ratio of the area of this inner square to that of the outer one and the probability of landing at least 0.2km from the edge?

    Question 6 - Given that a bomb landed in this inner square (at least 0.2km from the edge of the outer grid), what is the average of a uniformly distributed variable that ranges from 0km to 0.3km? Then don't forget to add on the extra 0.2km to get the total.


  • Registered Users, Registered Users 2 Posts: 2 waddler


    I can't figure out 5 or 6 either, could you explain more about the ratio? I'm confused :/


  • Closed Accounts Posts: 3,479 ✭✭✭ChemHickey


    Yakuza wrote: »
    If the grid squares have sides of 0.5km, then for a bomb to have landed at least 0.2km from the edge of a square, it must have landed somewhere in a square of side 0.3km (with the same centre point as the outer square). Can you see a relationship between the ratio of the area of this inner square to that of the outer one and the probability of landing at least 0.2km from the edge?

    Question 6 - Given that a bomb landed in this inner square (at least 0.2km from the edge of the outer grid), what is the average of a uniformly distributed variable that ranges from 0km to 0.3km? Then don't forget to add on the extra 0.2km to get the total.

    Only a question/opinion: if the bomb has to land at least 0.2km from the edge, would the inner square not have sides of 0.1km, seeing as it has to be 0.2km in from each side of the outer square?


  • Registered Users, Registered Users 2 Posts: 5,141 ✭✭✭Yakuza


    ChemHickey wrote: »
    Only a question/opinion: if the bomb has to land at least 0.2km from the edge, would the inner square not have sides of 0.1km, seeing as it has to be 0.2km in from each side of the outer square?
    You're quite right, I was concentrating on one edge and not all of them :)

    Same principle applies: what are the odds of falling in the 0.1km x 0.1km area inside the larger 0.5km x 0.5km area?
    Q6's answer is 0.2km plus the average of a uniform dist. between 0 and 0.1km


  • Registered Users, Registered Users 2 Posts: 2 waddler


    So then is the probability just 0.01/0.25 ?


  • Registered Users, Registered Users 2 Posts: 715 ✭✭✭Wesc.


    waddler wrote: »
    So then is the probability just 0.01/0.25 ?

    That's what I got Claire, yes... but I may be wrong :P


  • Registered Users, Registered Users 2 Posts: 5,141 ✭✭✭Yakuza


    1/25 or 4%, that's what I'd have said too
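
    For anyone who wants to check the 4% figure, here is a small sketch that computes the area ratio and compares it with a simulation of bombs dropped uniformly at random on one grid square. The helper `distance_to_edge`, the seed and the number of trials are arbitrary choices of mine, not part of the question.

```python
import random

SIDE = 0.5     # side of a grid square in km (area 0.25 square km)
MARGIN = 0.2   # required minimum distance from the edge in km

# Exact answer: ratio of the inner square's area to the grid square's area
inner_side = SIDE - 2 * MARGIN
p_exact = inner_side ** 2 / SIDE ** 2


def distance_to_edge(x, y, side=SIDE):
    """Shortest distance from the point (x, y) to any of the four sides."""
    return min(x, side - x, y, side - y)


# Simulation: drop bombs uniformly at random and count those far enough in
rng = random.Random(0)
trials = 200_000
hits = sum(
    distance_to_edge(rng.uniform(0, SIDE), rng.uniform(0, SIDE)) >= MARGIN
    for _ in range(trials)
)
p_sim = hits / trials

print(f"exact: {p_exact:.4f}, simulated: {p_sim:.4f}")
```

    The simulated proportion should land close to 0.04, which agrees with 0.01/0.25.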


  • Registered Users, Registered Users 2 Posts: 1,163 ✭✭✭hivizman


    Yakuza wrote: »
    Q6's answer is 0.2km plus the average of a uniform dist. between 0 and 0.1km

    Isn't it 0.2km plus the average of a uniform distribution between 0 and 0.05km? Once we get past 0.05km, we are closer to the opposite side of the grid square.

    But isn't there a further complication? Question 6 talks of the edge of the grid square, but a square has four sides. Do we interpret "the edge" as a collective term referring to all four sides? That seems to be the consensus for question 5. But then "the distance" from "the edge" of any point in the square would have to be the shortest distance perpendicularly from the point to any of the four sides. More formally, if we express the point as (x,y), where x is the distance in km from the west side of the grid square in a due east direction and y is the distance in km from the south side of the grid square in a due north direction (basically, cartesian coordinates with (0,0) being the south-west corner of the grid square), then "the distance" is min(x,0.5-x,y,0.5-y).

    We can assume without loss of generality, because of the symmetry of the grid square, that (x,y) is in the south-west quadrant, in which case "the distance" simplifies to min(x,y). Ignoring for the moment the requirement that the bomb falls at least 0.2km from "the edge" of the grid square, x and y come from independent random variables X and Y both uniformly distributed on the interval [0,0.25]. We need to find the expected value of a random variable Z = min(X,Y). To work out the value of E(Z), I suspect that it's necessary to use calculus, and this would make question 6 much more difficult than it appears at first sight.

    Is there an alternative reading of question 6 that allows it to be solved using a straightforward approach, as in question 5?


  • Registered Users, Registered Users 2 Posts: 5,141 ✭✭✭Yakuza


    hivizman wrote: »
    Isn't it 0.2km plus the average of a uniform distribution between 0 and 0.05km? Once we get past 0.05km, we are closer to the opposite side of the grid square.

    There I go, fixating on one edge again. You're quite right.

    hivizman wrote: »
    But isn't there a further complication? Question 6 talks of the edge of the grid square, but a square has four sides. Do we interpret "the edge" as a collective term referring to all four sides? That seems to be the consensus for question 5. But then "the distance" from "the edge" of any point in the square would have to be the shortest distance perpendicularly from the point to any of the four sides. More formally, if we express the point as (x,y), where x is the distance in km from the west side of the grid square in a due east direction and y is the distance in km from the south side of the grid square in a due north direction (basically, cartesian coordinates with (0,0) being the south-west corner of the grid square), then "the distance" is min(x,0.5-x,y,0.5-y).

    We can assume without loss of generality, because of the symmetry of the grid square, that (x,y) is in the south-west quadrant, in which case "the distance" simplifies to min(x,y). Ignoring for the moment the requirement that the bomb falls at least 0.2km from "the edge" of the grid square, x and y come from independent random variables X and Y both uniformly distributed on the interval [0,0.25]. We need to find the expected value of a random variable Z = min(X,Y). To work out the value of E(Z), I suspect that it's necessary to use calculus, and this would make question 6 much more difficult than it appears at first sight.

    Is there an alternative reading of question 6 that allows it to be solved using a straightforward approach, as in question 5?

    I have a sneaking suspicion that these are first year college stats questions; that kind of detail is at odds with the level that the other questions are set at. An interesting question, though!


  • Registered Users, Registered Users 2 Posts: 1,595 ✭✭✭MathsManiac


    I agree with Hivizman that the expected distance calculation is a bit more complicated than previously suggested, but I think it can be dealt with slightly more easily.

    First, though, I query the interpretation that most posters have taken of question 6. I consider the phrase "the bomb in the previous question" to mean "a bomb which landed within the London grid square we have sampled", (rather than a bomb that has landed at least 0.2 km from the edge of the square).

    Anyway, the technique is the same whichever way you interpret it.

    If we take Hivizman's technique a step further and divide the square into eight triangular pieces by drawing diagonals as well as horizontal and vertical lines through the centre, symmetry considerations again allow us to treat one of these pieces only.

    Assume the grid square has co-ordinates (0,0), (0.5,0), (0.5,0.5), (0,0.5).

    I choose to deal with the triangular piece with vertices (0,0), (0.25, 0.25), (0, 0.25). The distance from any point in this region to the nearest edge of the grid square is its x-co-ordinate, so the required expected value is the integral of x over this region. This gives 1/6.

    You could get this answer without calculus if you were prepared to accept that the "average co-ordinates" of a point over a region is the centroid of the region, and combine that with the knowledge that the centroid of a triangle is a third of the way from the base to the vertex along a median, (or use the formula for the co-ordinates of the centroid of a triangle: ((x1+x2+x3)/3, (y1+y2+y3)/3).)


  • Registered Users, Registered Users 2 Posts: 1,163 ✭✭✭hivizman


    MathsManiac wrote: »
    I choose to deal with the triangular piece with vertices (0,0), (0.25, 0.25), (0, 0.25). The distance from any point in this region to the nearest edge of the grid square is its x-co-ordinate, so the required expected value is the integral of x over this region. This gives 1/6.

    You could get this answer without calculus if you were prepared to accept that the "average co-ordinates" of a point over a region is the centroid of the region, and combine that with the knowledge that the centroid of a triangle is a third of the way from the base to the vertex along a median, (or use the formula for the co-ordinates of the centroid of a triangle: ((x1+x2+x3)/3, (y1+y2+y3)/3).)

    Do you mean 1/12 rather than 1/6? Intuitively, the centroid is going to be closer to the base than the vertex, which means that it's going to be less than 0.25/2 = 1/8. The way that the triangle's co-ordinates have been written, the triangle is the north-west half of the south-west quadrant of the grid square, so the "base" of the triangle is actually the y-axis. Hence the "distance from the edge" is given by the x co-ordinate of the centroid, and that's (0+0.25+0)/3 = 1/12.

    Doing some research on the internet, I found that the expected value of the random variable Z = MIN(X,Y), where X and Y are independent random variables both uniformly distributed on the interval [0,1], is 1/3, so taking the interval as [0,0.25], the expected value will be 0.25/3 = 1/12.

    The exploration took me to the mathematics of order statistics. Consider samples of n observations drawn from a random variable uniformly distributed on the interval [0,1]. Then the smallest observation n1 in the samples is distributed according to the beta distribution B(1,n) on the interval [0,1], and the mean of this is 1/(1+n).

    In the case of a "sample of one observation", you simply have the expected value of a uniform distribution on [0,1], which is 1/2. For the random variable Z = min(X,Y), you have a "sample of two observations", which thus has a mean of 1/3.

    Applying this to the interval [0,0.25], the expected value is thus 0.25/3 = 1/12.
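
    The order-statistics result quoted above (the smallest of n independent uniform observations on [0,1] has mean 1/(1+n)) can be checked numerically. The sketch below is a plain simulation; the function name `mean_of_minimum`, the seed and the number of trials are my own arbitrary choices.

```python
import random


def mean_of_minimum(n, trials=200_000, seed=1):
    """Estimate E[min(U1, ..., Un)] for independent uniform(0, 1) variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.random() for _ in range(n))
    return total / trials


# Compare the simulated mean with the theoretical value 1/(n + 1)
for n in (1, 2, 3):
    print(n, round(mean_of_minimum(n), 4), "vs", round(1 / (n + 1), 4))

# Rescale from [0, 1] to [0, 0.25] for the quadrant of the grid square
scaled = 0.25 * mean_of_minimum(2)
print("E[min(X, Y)] on [0, 0.25] is about", round(scaled, 4))
```

    For n = 2 the estimate should sit near 1/3, and rescaling to [0,0.25] gives a value near 1/12 (about 0.0833).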


  • Registered Users, Registered Users 2 Posts: 1,595 ✭✭✭MathsManiac


    Yes, of course - apologies. It suddenly occurred to me while I was doing something completely different today that it should have been 1/12! I had mentally worked it out on a unit square, and forgot to scale it down when I switched to the square of side 0.5.

    Also, if you do it by integration, I forgot to refer to the fact that you've to divide by the area. If you do the double integral on the triangle I suggested, you get 1/(6 * 4^3). But the area of the triangle is 1/32, so this also gives the answer as 1/12.
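
    The arithmetic in this post can be reproduced with exact fractions, and the result compared with a simulation over the whole grid square. This is only a sketch under the reading that the bomb lands uniformly at random anywhere in the square; the variable names, seed and trial count are my own.

```python
from fractions import Fraction
import random

a = Fraction(1, 4)  # 0.25 km: half the side of the grid square

# Integral of x over the triangle 0 <= x <= a, x <= y <= a:
# the inner integral over y gives (a - x), and the integral of
# x * (a - x) from 0 to a is a^3/2 - a^3/3 = a^3/6.
integral = a ** 3 / 6

# Area of the triangle with vertices (0,0), (a,a), (0,a)
area = a * a / 2

# Expected distance = integral of x divided by the area
expected = integral / area
print(integral, area, expected)

# Simulation over the full 0.5 km square using the nearest-side distance
rng = random.Random(2)
side = 0.5
trials = 200_000
total = 0.0
for _ in range(trials):
    x = rng.uniform(0, side)
    y = rng.uniform(0, side)
    total += min(x, side - x, y, side - y)
simulated = total / trials
print("simulated mean distance:", round(simulated, 4))
```

    The exact values are 1/384 for the integral, 1/32 for the area and 1/12 for the expected distance, and the simulated mean should be close to 0.0833 km.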


  • Registered Users, Registered Users 2 Posts: 1,849 ✭✭✭764dak


    Q1. Can this be solved reasonably without programming? What’s the chance of getting a run of K or more successes (heads) in a row in N Bernoulli trials (coin flips)?
    Calculate it using N = 50 and K =10

    Q2. You are repeatedly flipping a fair coin. What is the expected number of flips until the first time that your previous 2012 flips are ‘HTHT...HT’?

    Q3. (Generalization) What is the expected number of coin flips for getting N consecutive heads, given N?

    Q4. Candidates are appearing for interview one after other. Probability of each candidate getting selected is 0.16. What is the expected number of candidates that you will need to interview to make sure that you select somebody?

    Q5. What is the expected number of Bernoulli trials to ensure that there are at least N successes, if the probability of each success is p?


  • Registered Users, Registered Users 2 Posts: 1,849 ✭✭✭764dak


    764dak wrote: »
    Q1. Can this be solved reasonably without programming? What’s the chance of getting a run of K or more successes (heads) in a row in N Bernoulli trials (coin flips)?
    Calculate it using N = 50 and K =10

    Q2. You are repeatedly flipping a fair coin. What is the expected number of flips until the first time that your previous 2012 flips are ‘HTHT...HT’?

    Q3. (Generalization) What is the expected number of coin flips for getting N consecutive heads, given N?

    Q4. Candidates are appearing for interview one after other. Probability of each candidate getting selected is 0.16. What is the expected number of candidates that you will need to interview to make sure that you select somebody?

    Q5. What is the expected number of Bernoulli trials to ensure that there are at least N successes, if the probability of each success is p?

    I know these questions are hard but wow.


  • Registered Users, Registered Users 2 Posts: 1,595 ✭✭✭MathsManiac


    Are you saying "wow" because you're surprised at not getting any replies?

    This forum is not for people to get their homework done. Sticking up a list of homework-like problems and giving no indication about what approaches you've tried so far, how far you got, etc., is not likely to get too many people around here on side.

    (It's also possible, of course, that nobody here knows how to do any of these, but my experience suggests that this is not very likely.)


  • Registered Users, Registered Users 2 Posts: 5,141 ✭✭✭Yakuza


    I didn't even notice those last few posts; I thought this thread had run its course. That, coupled with the facts that they have nothing to do with the original problem and that they do seem to be lifted out of a textbook, would leave me less inclined to answer them. They aren't that hard, and there are also plenty of threads in this forum on Bernoulli trial / binomial distribution type questions.


  • Registered Users, Registered Users 2 Posts: 1,849 ✭✭✭764dak


    MathsManiac wrote: »
    Are you saying "wow" because you're surprised at not getting any replies?

    This forum is not for people to get their homework done. Sticking up a list of homework-like problems and giving no indication about what approaches you've tried so far, how far you got, etc., is not likely to get too many people around here on side.

    (It's also possible, of course, that nobody here knows how to do any of these, but my experience suggests that this is not very likely.)
    Yakuza wrote: »
    I didn't even notice those last few posts; I thought this thread had run its course. That, coupled with the facts that they have nothing to do with the original problem and that they do seem to be lifted out of a textbook, would leave me less inclined to answer them. They aren't that hard, and there are also plenty of threads in this forum on Bernoulli trial / binomial distribution type questions.

    I apologize for posting these questions. Poster #1 and #6 posted questions without stating approaches so I assumed I could have done the same. These aren't homework problems either.

    The hardest one is number one.

    K is less than or equal to N.

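    Let S(N,K) be the probability of getting a run of K or more heads in a row in N independent flips, where p is the probability of heads on each flip. If N is less than K, no such run is possible, so S(N,K) = 0.

    S(N,K) is the probability of getting K heads at the very beginning (p^K) plus the probability of getting a run when that does not occur. If the first K flips are not all heads, at least one tail must occur within the first K flips. Let j be the position of the first tail, so j runs from 1 to K. The probability of the first tail occurring at flip j is (1-p)*p^(j-1). After that tail the run has to start afresh in the remaining N-j flips, and the probability of that is S(N-j,K). So the probability of the first tail occurring at flip j and K or more heads in a row following it is S(N-j,K)*(1-p)*p^(j-1).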

    The probability S(N,K) would look something like this:
    $$S(N,K) = p^{K} + \sum_{j=1}^{K} p^{j-1} (1-p) \, S(N-j,K)$$
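
    A recursion like this is tedious by hand but short to code. The sketch below implements the formula as written, with the added assumption that S(n,K) = 0 whenever n < K (no room for a run). The function name `run_probability` and the use of memoisation are my own choices.

```python
from functools import lru_cache


def run_probability(n, k, p=0.5):
    """Probability of a run of k or more heads in n flips, P(heads) = p."""

    @lru_cache(maxsize=None)
    def s(m):
        # Assumed base case: fewer than k flips left means no run can occur
        if m < k:
            return 0.0
        # The first k flips are all heads
        total = p ** k
        # Otherwise the first tail is at flip j (1 <= j <= k) and the
        # run has to occur in the remaining m - j flips
        for j in range(1, k + 1):
            total += p ** (j - 1) * (1 - p) * s(m - j)
        return total

    return s(n)


# Small cases that can be checked by listing outcomes:
# 3 flips, run of 2: HHH, HHT, THH -> 3/8
print(run_probability(3, 2))
# The case asked about in Q1
print(run_probability(50, 10))
```

    For N = 50 and K = 10 with a fair coin this gives a probability of about 2%.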


  • Registered Users, Registered Users 2 Posts: 1,163 ✭✭✭hivizman


    Your first question was: "Can this be solved reasonably without programming? What’s the chance of getting a run of K or more successes (heads) in a row in N Bernoulli trials (coin flips)?
    Calculate it using N = 50 and K =10"

    A similar question came up in a thread last month, and you may wish to look at this post for a discussion.

    Because the solution involves a recursive relationship, it is tedious and time-consuming to solve manually, so use of a spreadsheet package is advised.

    I haven't checked to see whether your formula for the required probability and mine are mathematically equivalent.

