Probability question (Binomial Distribution)

  • 25-10-2011 1:25pm
    #1
    Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭


    I have the following question, and I am unsure as to how to solve it.

    A man visits his doctor in a given month (at most once) whenever he is having heart pains. The chance of him having such pains in a given month is 0.2, and these pains do not affect in any way pains occurring later. X = the number of visits he makes in a typical year. Find the mean and variance of X. In a given year, the first visit costs €75 and any subsequent visits cost €50. C = total doctor's bill for a given year. Find the mean and variance of C.

    I could do the first bit, noticing
    X~Binomial(n=12, p=0.2),
    therefore mean = np = 2.4 and variance = npq = 1.92 (q = 1-p).

    For the second bit, I defined C = 50X + 25Y
    (Y = 1 for at least one visit, Y=0 for no visits)
    Therefore Y ~ Bernoulli(p=0.2)

    So the mean of C = [latex]\mathbb{E}[50X] + \mathbb{E}[25Y][/latex]
    = 50(2.4) + 25(0.2), because the expected value of a Bernoulli RV is just p.

    The variance is what confuses me. X and Y are not independent so this is quite difficult.
    I said Var[C] = Var[50X + 25Y]
    = Var[50X] + Var[25Y] + 2Cov[50X, 25Y]
    = 2500Var[X] + 625Var[Y] + 2500Cov[X,Y] by properties of variances.

    How do I calculate the covariance of X and Y? I know it's [latex]\mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y][/latex] in the terms of expectations, but I'm not sure how to get E[XY]. Or have I just gone about the question wrongly?
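One way to get E[XY], sketched in Python (an editorial choice of language, not something used in the thread): Y is 1 exactly when X ≥ 1, so the product XY equals X for every outcome and E[XY] = E[X]. The sketch below checks this by enumerating the Binomial(12, 0.2) pmf. It assumes P(Y = 1) = P(X ≥ 1) = 1 - 0.8^12, which follows from the definition of Y rather than from the 0.2 used in the post; all variable names are illustrative.

```python
from math import comb

n, p = 12, 0.2  # twelve months, monthly visit probability from the question

# Binomial pmf: pmf[k] = P(X = k)
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def y(k):
    # Y = 1 for at least one visit, Y = 0 for no visits
    return 1 if k >= 1 else 0

e_x = sum(k * pk for k, pk in enumerate(pmf))
e_y = sum(y(k) * pk for k, pk in enumerate(pmf))          # P(X >= 1), assumed definition of Y
e_xy = sum(k * y(k) * pk for k, pk in enumerate(pmf))     # equals E[X] because XY = X

var_x = sum(k**2 * pk for k, pk in enumerate(pmf)) - e_x**2
var_y = e_y * (1 - e_y)                                   # Bernoulli variance
cov_xy = e_xy - e_x * e_y

# Var[50X + 25Y] = 2500 Var[X] + 625 Var[Y] + 2500 Cov[X, Y]
var_c = 2500 * var_x + 625 * var_y + 2500 * cov_xy
print(e_xy, cov_xy, var_c)
```

Under these assumptions the covariance term is small but not zero, so dropping it (or treating X and Y as independent) would understate Var[C].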


Comments

  • Registered Users, Registered Users 2 Posts: 2,481 ✭✭✭Fremen


    Timbuk2 wrote: »
    I have the following question, and I am unsure as to how to solve it.

    A man visits his doctor in a given month (at most once) whenever he is having heart pains. The chances of him having such pains are 0.2, and these pains do not affect in any way pains occurring later.

    0.2 in a given month? In a given day? I think the problem's ambiguous as stated.


  • Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭**Timbuk2**


    Fremen wrote: »
    0.2 in a given month? In a given day? I think the problem's ambiguous as stated.

    Hi, sorry, it does state that the probability of him having such a pain in a given month is 0.2. I left that bit out when typing it!


  • Registered Users, Registered Users 2 Posts: 2,481 ✭✭✭Fremen


    I think you've set C up in an awkward way. You could calculate E[XY] by summing over all possible values XY could take, weighted by the appropriate probabilities.

    However, it's better to observe that

    C = 0 if 0 visits

    C = 75 if 1 visit

    C = 75 + (X-1)*50 if > 1 visit.

    ...so how do you calculate the statistics of C?


  • Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭**Timbuk2**


    OK, that does seem like an easier way to set up C - I was only guessing as to how I'd calculate it.

    And at that, I seem to have made a mistake earlier.
    I said C = 50X + 25Y, and defined Y=1 for at least one visit. So I should have calculated the mean of C as follows
    E(C) = 50E(X) + 25E(Y)
    E(C) = 50(2.4) + 25(1-(0.8)^12) = 143.28,
    instead of saying that E(Y) = 0.2, which I did earlier.

    However, I am trying to calculate the mean using your improved way of writing the Cost RV C.

    C=0 if X=0
    C=75+(X-1)(50) if X >= 1 (this takes into account that C=75 for one visit).
    So E(C) = E[75+(X-1)(50)]
    = E[75-50 + 50X] (is this a valid step?)
    = E[25 + 50X]
    = 25 + 50(2.4) = 145

    Which is close to the answer I got above, but not equal to it. Can I multiply out 50(X-1), or is X-1 considered a random variable in its own right?


  • Registered Users, Registered Users 2 Posts: 2,481 ✭✭✭Fremen


    Timbuk2 wrote: »
    Which is close to the answer I got above, but not equal to it. Can I multiply out 50(X-1), or is X-1 considered a random variable in its own right?

    It's a random variable in its own right (but once you know X, you know X-1 and vice versa). RVs still obey the rules of arithmetic, so you can multiply it out like you did.


  • Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭**Timbuk2**


    Fremen wrote: »
    It's a random variable in its own right (but once you know X, you know X-1 and vice versa). RVs still obey the rules of arithmetic, so you can multiply it out like you did.

    OK, great, thanks!

    So for the variance, can I apply the same logic?

    Namely Var[C] = Var[75+(X-1)(50)]
    = Var[25 + 50X]
    = Var[50X]
    = 2500*Var[X]
    = 2500*(1.92) = 4,800

    Does that seem correct? It's on a revision sheet so I don't have the answer yet, but both the mean cost and the variance in the cost seem plausible!

    Thanks again!


  • Registered Users, Registered Users 2 Posts: 1,583 ✭✭✭alan4cult


    C = Total Bill for Year

    25 is like an initial setup cost and it's 50 per visit (including the first)

    If X = 0 then C = 0
    If X > 0 then C = 25 + 50X

    E(C | X = 0) = 0
    E(C | X > 0) = 25 + 50E(X | X > 0) = 25 + 50(2.577) = 153.85

    E(X | X > 0) = E(X) / P(X > 0) = 2.577

    Using law of total expectation:

    E(C)
    = E(C | X = 0)P(X = 0) + E(C | X > 0)P(X > 0)
    = 0*P(X = 0) + 153.85*P(X > 0)
    = 153.85*P(X > 0)
    = 143.28

    I'll post up the variance in a sec.
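The law-of-total-expectation steps above can be checked numerically. The following is a minimal Python sketch, not the poster's own calculation: it builds the conditional distribution of X given X > 0 by renormalising the Binomial(12, 0.2) pmf, then recombines the two cases. Variable names are illustrative.

```python
from math import comb

n, p = 12, 0.2
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

p_pos = 1 - pmf[0]  # P(X > 0) = 1 - 0.8**12

# Conditional pmf of X given X > 0: keep k >= 1 and divide by P(X > 0)
cond = {k: pmf[k] / p_pos for k in range(1, n + 1)}

e_x_given_pos = sum(k * q for k, q in cond.items())   # should match E(X) / P(X > 0)
e_c_given_pos = 25 + 50 * e_x_given_pos               # C = 25 + 50X whenever X > 0

# Law of total expectation over the two cases X = 0 and X > 0
e_c = 0 * pmf[0] + e_c_given_pos * p_pos
print(e_x_given_pos, e_c_given_pos, e_c)
```

The shortcut E(X | X > 0) = E(X) / P(X > 0) works here because X contributes nothing to E(X) when X = 0.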


  • Registered Users, Registered Users 2 Posts: 1,583 ✭✭✭alan4cult


    E(C^2)

    E(C^2 | X = 0) = 0
    E(C^2 | X > 0) = E[(25 + 50X)^2 | X > 0]

    (25 + 50X)^2
    = 625 + 2500X + 2500X^2

    so

    E(C^2 | X > 0) = 625 + 2500*E(X | X > 0) + 2500*E(X^2 | X > 0)

    E(X | X > 0) = 2.577 (from last post)
    E(X^2 | X > 0)
    = [ Var(X) + [E(X)]^2 ] / P(X > 0)
    = [ 1.92 + 2.4^2 ] / 0.9313
    = 8.247

    so

    E(C^2 | X > 0) = 625 + 2500*E(X | X > 0) + 2500*E(X^2 | X > 0)
    = 625 + 2500*(2.577) + 2500*(8.247)
    = 27685

    E(C^2)
    = 27685 * P(X > 0)
    = 27685 * 0.9313
    = 25783

    Var(C)
    = E(C^2) - [E(C)]^2
    = 25783 - (143.28)^2
    = 5254

    I know there's a cleverer way to do this but I'm really tired so I wrote it out long hand. Also there is rounding error.

    The calculation in R gives 5252.31505432423
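The rounding error mentioned above disappears if the conditional moments are never rounded. A minimal Python sketch (not the R calculation from the post, which is not shown in the thread) uses only the closed-form binomial moments; the key assumption is that X and X^2 are zero when X = 0, so E(X^k | X > 0) * P(X > 0) = E(X^k).

```python
n, p = 12, 0.2
q = 1 - p

e_x = n * p              # E(X) = 2.4
var_x = n * p * q        # Var(X) = 1.92
e_x2 = var_x + e_x**2    # E(X^2) = 7.68
p_pos = 1 - q**n         # P(X > 0)

# Multiply each conditional moment by P(X > 0); the divisions cancel.
e_c = 25 * p_pos + 50 * e_x
e_c2 = 625 * p_pos + 2500 * e_x + 2500 * e_x2
var_c = e_c2 - e_c**2
print(e_c, e_c2, var_c)
```

Carried to full precision, this agrees with the R figure quoted in the post to the displayed digits.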


  • Moderators, Education Moderators, Motoring & Transport Moderators Posts: 7,396 Mod ✭✭✭✭**Timbuk2**


    Thanks for that, Alan! That's a really good way of doing it, but I'm not sure why it gives a different answer for variance than the Var[75+(X-1)(50)] = 4,800 I calculated above, using Fremen's prompt!

    Is it complicated to calculate expected values and variances of random variables in R? I have it downloaded to my laptop, and we do computer labs using R, but we haven't come across that (yet) - perhaps it is too complicated?

    Edit: I'll be getting the solution for this question during next week's tutorial, I think, I'll post it up then!


  • Registered Users, Registered Users 2 Posts: 1,595 ✭✭✭MathsManiac


    Your solution failed to fully take account of the fact that the rule for calculating C doesn't apply when X=0.

    I think that splitting it up the way alan4cult did is the only reasonable way to do it.


  • Registered Users, Registered Users 2 Posts: 1,583 ✭✭✭alan4cult


    Well, you can draw random samples in R with functions like runif (random uniform) and rbinom (random binomial).

    Usually when there's a probability problem I can't do I run a simulation to see what the rough answer should be.
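A simulation along these lines might look as follows. This is a rough Python sketch rather than the R code the poster has in mind: it uses the standard `random` module in place of `rbinom`, and the function name `simulate_costs`, the seed, and the number of simulated years are all arbitrary choices made for illustration.

```python
import random

def simulate_costs(years, n=12, p=0.2, seed=0):
    """Simulate the yearly doctor's bill for a given number of independent years."""
    rng = random.Random(seed)
    costs = []
    for _ in range(years):
        # One Bernoulli(p) trial per month; their sum is a Binomial(n, p) draw
        visits = sum(rng.random() < p for _ in range(n))
        # First visit costs 75, each later visit costs 50, no visits cost nothing
        costs.append(0 if visits == 0 else 75 + 50 * (visits - 1))
    return costs

costs = simulate_costs(100_000)
sim_mean = sum(costs) / len(costs)
sim_var = sum((c - sim_mean) ** 2 for c in costs) / len(costs)
print(sim_mean, sim_var)
```

The estimates vary with the seed and sample size, so they only indicate roughly where the exact mean and variance should land.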


  • Registered Users, Registered Users 2 Posts: 2,481 ✭✭✭Fremen


    MathsManiac wrote: »
    Your solution failed to fully take account of the fact that the rule for calculating C doesn't apply when X=0.

    I think that splitting it up the way alan4cult did is the only reasonable way to do it.

    It seems needlessly complicated to me. Surely you can just set up the expression for C in terms of X, then use the definition of expectation:

    E[C] = (sum over all n) n*P(C = n).
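The definition-based approach can be sketched by building the distribution of C directly and summing over its values. This Python sketch assumes the piecewise cost rule given earlier in the thread (0 for no visits, 75 + 50(X - 1) otherwise); the names `cost` and `pmf_c` are illustrative.

```python
from collections import defaultdict
from math import comb

n, p = 12, 0.2

def cost(x):
    # Piecewise rule: no visits cost nothing; otherwise 75 for the first, 50 for each extra
    return 0 if x == 0 else 75 + 50 * (x - 1)

# pmf of C: push each value of X through the cost rule and collect probability mass
pmf_c = defaultdict(float)
for x in range(n + 1):
    pmf_c[cost(x)] += comb(n, x) * p**x * (1 - p)**(n - x)

# E[C] = sum over c of c * P(C = c), and likewise for the second moment
e_c = sum(c * pc for c, pc in pmf_c.items())
var_c = sum(c**2 * pc for c, pc in pmf_c.items()) - e_c**2
print(e_c, var_c)
```

Because the X = 0 case is handled inside `cost`, the sum covers every outcome without a separate conditioning step.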


  • Registered Users, Registered Users 2 Posts: 1,595 ✭✭✭MathsManiac


    Fremen wrote: »
    It seems needlessly complicated to me. Surely you can just set up the expression for C in terms of X, then use the definition of expectation:

    E[C] = (sum over all n) n*P(C = n).

    Well that's essentially what alan4cult did. The problem is that the expression for C in terms of X can't easily be written as a single rule that covers all cases. In particular, the rule C = 25 + 50X (or equivalently C = 75 + 50(X - 1)) is not valid for X = 0. So, unless you can create a rule that works for all cases, you have to break it into pieces in order to, as you describe it, "sum over all n".

    Following your (correct) prompt, Timbuk2 stated a correct composite rule for C, but then went on to ignore the X=0 part of that rule when calculating the expectation, whereas alan4cult took account of both cases (X=0 and X>0). That's what I was trying to point out.
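The size of the error from ignoring the X = 0 case can be made concrete. The Python sketch below is an illustration rather than anything posted in the thread: it compares the piecewise rule with the rule C = 50X + 25Y from the opening post (Y = 1 when X > 0), which does appear to cover all cases at the price of a second, dependent variable, and then measures how far the all-cases linear rule 25 + 50X is off in the mean.

```python
from math import comb

n, p = 12, 0.2
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

def piecewise_cost(x):
    return 0 if x == 0 else 75 + 50 * (x - 1)

def indicator_cost(x):
    # 50 per visit plus a 25 surcharge that applies only when x > 0
    return 50 * x + 25 * (x > 0)

# The two rules give the same bill for every possible number of visits
rules_agree = all(piecewise_cost(x) == indicator_cost(x) for x in range(n + 1))

# Applying 25 + 50X to every x, including x = 0, overcharges the no-visit year by 25
naive_mean = sum((25 + 50 * x) * px for x, px in enumerate(pmf))
exact_mean = sum(piecewise_cost(x) * px for x, px in enumerate(pmf))
gap = naive_mean - exact_mean   # equals 25 * P(X = 0)
print(rules_agree, naive_mean, exact_mean, gap)
```

The gap in the mean is 25 times the probability of a visit-free year, which accounts for the difference between 145 and 143.28 seen earlier in the thread.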

