
Question of Covid trial statistics

  • 23-12-2020 1:47pm
    #1
    Registered Users Posts: 9,497 ✭✭✭ antiskeptic


    From the off let me say that I wouldn't take this, or any other, vaccine. Judging whether a vaccine is safe and effective requires me to rely on the bona fides and competence of corporations and agencies with a spotted history in these regards. I'm not here to argue the merits or otherwise of that position.

    For the purposes of the question posed I am going to assume these numbers from the Pfizer trial. They may differ but the precise number is irrelevant to the question.


    24,000 people in the vaccinated group

    24,000 people in the placebo group

    9 subsequent Covid cases in the vaccinated group

    132 subsequent Covid cases in the placebo group.

    We don't assume anything at the start re vaccine effectiveness. For all we know at the outset of the trial, it could be as effective as the placebo at preventing Covid infection.

    We also assume that if exposed, a person will be infected. They might be asymptomatic or symptomatic (about 50% are asymptomatic I gather).

    I don't know if the trial went on to regularly test the participants to see if they were infected, or whether cases were self-reported (in which case the 9/132 represent half the actual numbers). We'll assume self-reported.

    -

    The two groups are administered vaccine and placebo respectively and go forth into the world. A certain number in each group are exposed to Covid. Precisely how many, we don't know. It won't be an identical number of exposures for each group.

    My question is this:

    What is the possible spread in the number of people in a group of 24000 who could be exposed to the virus (and thus be infected)?

    Take the placebo group, for example. We know 132 were exposed to Covid. But what would happen if a 2nd group of 24000 placebo'd people were sent out into the world. It clearly wouldn't result in precisely 132 contracting Covid. It would be some other number.

    If you carried out 100 x 24000 placebo'd trials you would get all sorts of numbers for those contracting Covid. There would be a spread of numbers. 132 could turn out to be the lowest number, perhaps 500 is the highest. And perhaps the bulk of the numbers sit around the 400 mark - making 400 a more representative figure.

    Conversely, 132 could be the highest number and the bulk of the figures settle down around the 14 mark, making 14 a more representative figure.

    The same, of course, can be asked of the vaccine group.

    Now I understand you can't actually go and run a 100 x 24000 trial on each of 2 groups. Because of this, it seems that an assumption is made about the figures obtained from a single trial for each of the two groups. The assumption is that: the number of infected in each group is truly representative.

    How is this assumption established, though? We know that you wouldn't, for instance, get 132 every time. So how do we know that 132 is a fair representation of the bulk figure were you to trial 100 groups of 24000? Ditto the 9 figure in the vaccinated group.


    Remember, the actual number of people being exposed to the virus is only a tiny fraction of the 24000. Only a small number get infected after all (if we assume all exposed become infected, save for an effective vaccine).

    If you had 100 groups of 24,000 unvaccinated, could the range of exposure each group experiences spread between 9 and 132? For if so, the Pfizer results say nothing.

    And if the spread could not be that wide, how do we know?

    Thanks for reading..


Comments

  • Registered Users Posts: 8,946 ✭✭✭ Ficheall


    I'm not sure your username fits.


  • Registered Users Posts: 13,396 ✭✭✭✭ CIARAN_BOYLE


    Here's a WHO article from 1988 about sample sizes and confidence intervals in vaccine trials. You may find it interesting.

    https://apps.who.int/iris/bitstream/handle/10665/264550/PMC2491112.pdf

    It's important to note that because people are randomly assigned to the placebo or vaccine arm, the two groups are similar. They are therefore less likely to be wildly different.


  • Registered Users Posts: 9,497 ✭✭✭ antiskeptic


    Ficheall wrote: »
    I'm not sure your username fits.

    :)


  • Registered Users Posts: 4,581 ✭✭✭ LLMMLL


    This is basic statistics. You calculate a group size so that the probability of what you're measuring in that group differing from the total population is very small.

    Like if you were trying to ascertain the average height of men in Ireland and you randomly picked 1000 Irish men, it's very unlikely that you'd pick 1000 people over 6ft2.


  • Registered Users Posts: 9,497 ✭✭✭ antiskeptic


    LLMMLL wrote: »
    This is basic statistics. You calculate a group size so that the probability of what you're measuring in that group differing from the total population is very small.

    Like if you were trying to ascertain the average height of men in Ireland and you randomly picked 1000 Irish men, it's very unlikely that you'd pick 1000 people over 6ft2.

    Understood. But if you picked 2 groups of 1000 Irishmen (i.e. 2 groups of 24000) and decided you were going to see how many of them were red heads (a comparative rarity, just like infected cases are a comparative rarity in the vaccine trial), you would be unlikely to get an equal number of red heads in each group. You could easily get a big spread in your result

    Like a coin toss of 100, you won't get 50/50 if you repeat the exercise 300 times. Indeed, you might not even get 50/50 once. You'll get a spread of results.

    Given the tiny figures involved in the Pfizer trial: 9 and 132 infected from two groups of 24,000, the question is whether the natural spread of results (were you to repeat the exercise many times) could account for the difference. Rather than the vaccine.

    How does one calculate the natural spread that will be obtained at the outset - whether red heads or infected people? It doesn't matter how big the group is, you are going to get a spread, and if the figures in your cohort of interest (e.g. red heads) are tiny then natural spread can have a huge effect.


  • Moderators, Sports Moderators Posts: 11,205 Mod ✭✭✭✭ hmmm


    the question is whether the natural spread of results .. could account for the difference.
    Of course not. Do you think this is the only time a medical trial has ever been run, and they are administered by dunces?

    Unless you're deliberately trying to cast doubt on vaccines, stop and think about what you are asking and how ludicrous a question it is.


  • Registered Users Posts: 4,581 ✭✭✭ LLMMLL


    Understood. But if you picked 2 groups of 1000 Irishmen (i.e. 2 groups of 24000) and decided you were going to see how many of them were red heads (a comparative rarity, just like infected cases are a comparative rarity in the vaccine trial), you would be unlikely to get an equal number of red heads in each group. You could easily get a big spread in your result

    Like a coin toss of 100, you won't get 50/50 if you repeat the exercise 300 times. Indeed, you might not even get 50/50 once. You'll get a spread of results.

    Given the tiny figures involved in the Pfizer trial: 9 and 132 infected from two groups of 24,000, the question is whether the natural spread of results (were you to repeat the exercise many times) could account for the difference. Rather than the vaccine.

    How does one calculate the natural spread that will be obtained at the outset - whether red heads or infected people? It doesn't matter how big the group is, you are going to get a spread, and if the figures in your cohort of interest (e.g. red heads) are tiny then natural spread can have a huge effect.

    I don't think you understand. Natural spread has nothing to do with it. The question is the probability that the quantity measured in the trial group will differ to the entire population. Whether the quantity measured is high or low doesn't change that.

    Your coin toss analogy doesn't match up with what we're talking about at all. In stats it's fully acknowledged there will be a spread of outcomes. That's why I said the PROBABILITY of the group differing from the total population, not the CERTAINTY.


  • Registered Users Posts: 15,468 ✭✭✭✭ odyssey06


    Question asked and answered already on the vaccines thread... for those who are actually looking for an explanation:

    https://www.boards.ie/vbulletin/showpost.php?p=115684468&postcount=408


  • Registered Users Posts: 1,420 ✭✭✭ nudain


    Wait a minute, I just flipped a coin 10 times and I didn't get 5 heads and 5 tails. What kinda foul sorcery is this?


  • Administrators, Social & Fun Moderators, Sports Moderators Posts: 62,312 Admin ✭✭✭✭✭ Beasty


    nudain wrote: »
    Wait a minute, I just flipped a coin 10 times and I didn't get 5 heads and 5 tails. What kinda foul sorcery is this?

    Try flipping it nine times and I can guarantee the results will not align with your first example


  • Registered Users Posts: 1,414 ✭✭✭ PCeeeee


    Understood. But if you picked 2 groups of 1000 Irishmen (i.e. 2 groups of 24000) and decided you were going to see how many of them were red heads (a comparative rarity, just like infected cases are a comparative rarity in the vaccine trial), you would be unlikely to get an equal number of red heads in each group. You could easily get a big spread in your result

    Like a coin toss of 100, you won't get 50/50 if you repeat the exercise 300 times. Indeed, you might not even get 50/50 once. You'll get a spread of results.

    Given the tiny figures involved in the Pfizer trial: 9 and 132 infected from two groups of 24,000, the question is whether the natural spread of results (were you to repeat the exercise many times) could account for the difference. Rather than the vaccine.

    How does one calculate the natural spread that will be obtained at the outset - whether red heads or infected people? It doesn't matter how big the group is, you are going to get a spread, and if the figures in your cohort of interest (e.g. red heads) are tiny then natural spread can have a huge effect.

    A statistic is an estimate of the population. No statistic can ever define the population.

    The whole science of statistics is based on this.

    Do you accept that?


  • Administrators, Social & Fun Moderators, Sports Moderators Posts: 62,312 Admin ✭✭✭✭✭ Beasty


    If you go back to school mathematics, normal distribution curves, standard deviations and the like, the statisticians can evaluate their results with a level of assurance - so they may be 95% satisfied the results will be within quite a narrow range. There is no "certainty" with stats, just different levels of assurance depending on the data and sample size


  • Registered Users Posts: 2,305 ✭✭✭ Bit cynical


    My statistics is a bit rusty but if we require 99.9% confidence then I think it is 132 +- 38 on the placebo sample and 9 +- 10 on the vaccine sample testing positive. Since there's no overlap, the probability that the outcome is the result of chance must be a good deal less than 0.1%. Maybe someone else can double check?
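
    One way to double check those figures, as a rough sketch rather than the trial's actual method: treat each case count as approximately Poisson, so its standard deviation is about the square root of the count, and use the normal approximation for a two-sided 99.9% interval. The function name half_width is made up for illustration, and the normal approximation is crude for a count as small as 9.

```python
from math import sqrt
from statistics import NormalDist

def half_width(count, confidence=0.999):
    """Approximate +- margin for a Poisson-like count (normal approximation)."""
    # Two-sided interval: put half of the leftover probability in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(count)

placebo_margin = half_width(132)   # margin around the 132 placebo cases
vaccine_margin = half_width(9)     # margin around the 9 vaccine cases

print(f"placebo: 132 +- {placebo_margin:.1f}")
print(f"vaccine: 9 +- {vaccine_margin:.1f}")
# The intervals overlap only if the top of one reaches the bottom of the other.
print("overlap:", 9 + vaccine_margin >= 132 - placebo_margin)
```

    Under these assumptions the margins round to the 38 and 10 quoted above, and the two intervals do not overlap.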


  • Registered Users Posts: 9,497 ✭✭✭ antiskeptic


    LLMMLL wrote: »
    I don't think you understand. Natural spread has nothing to do with it. The question is the probability that the quantity measured in the trial group will differ to the entire population. Whether the quantity measured is high or low doesn't change that.

    Your coin toss analogy doesn't match up with what we're talking about at all. In stats it's fully acknowledged there will be a spread of outcomes. That's why I said the PROBABILITY of the group differing from the total population, not the CERTAINTY.

    Natural spread would have a lot to do with it if you got a range of 5-350 cases when carrying out a placebo trial on 1000 groups of 24000 people, and the plot showed the centre of gravity of cases was 243.

    In other words, how is 132 cases representative of an entire population, other than just assuming it is?


  • Registered Users Posts: 695 ✭✭✭ DaSilva


    Here man, teach yourself to fish instead of asking for a fish
    https://www.youtube.com/watch?v=KS6KEWaoOOE


  • Registered Users Posts: 4,581 ✭✭✭ LLMMLL


    Natural spread would have a lot to do with it if you got a range of 5-350 cases when carrying out a placebo trial on 1000 groups of 24000 people, and the plot showed the centre of gravity of cases was 243.

    In other words, how is 132 cases representative of an entire population, other than just assuming it is?

    Nope the natural spread is accounted for. The whole point of the most basic statistical tests is to account for natural spread. It's not a side issue that they forgot to address. It's a core issue that all statistics automatically addresses.

    The issue here is that you don't understand statistics.


  • Moderators, Science, Health & Environment Moderators Posts: 1,832 Mod ✭✭✭✭ Michael Collins


    Understood. But if you picked 2 groups of 1000 Irishmen (i.e. 2 groups of 24000) and decided you were going to see how many of them were red heads (a comparative rarity, just like infected cases are a comparative rarity in the vaccine trial), you would be unlikely to get an equal number of red heads in each group. You could easily get a big spread in your result

    It is true that you would be unlikely to get an equal number. Very unlikely, in fact. However, if both groups were chosen at random, and you waited until you found approximately 140 people with red hair, you'd be very surprised if over 90% of those 140 were from one group with only 10% from the other. Do you agree?

    Like a coin toss of 100, you won't get 50/50 if you repeat the exercise 300 times. Indeed, you might not even get 50/50 once. You'll get a spread of results.

    Very true, and in fact it is a somewhat common misconception in the field of probability that the numbers of each will eventually be the same as the number of trials increases. Of course, this is not true. Again, as in your last example, the chance of the numbers being exactly equal is extremely small.

    However, as the number of trials increases, if the coin is fair, you'd expect

    (number of heads)/(total number of flips) -> 1/2

    i.e. it should approach 1/2, but not necessarily equal 1/2.
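
    A quick simulation makes this concrete. This is only a sketch: the seed and the flip counts are arbitrary choices, and Python's pseudo-random generator stands in for a fair coin.

```python
import random

random.seed(2020)  # arbitrary seed so the run is repeatable

def flip_summary(n_flips):
    """Flip a simulated fair coin n_flips times; return (heads, tails)."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads, n_flips - heads

results = {}
for n in (100, 10_000, 1_000_000):
    heads, tails = flip_summary(n)
    results[n] = (heads, tails)
    # The ratio settles towards 1/2 even though heads and tails rarely match.
    print(f"{n:>9} flips: heads/flips = {heads / n:.4f}, heads - tails = {heads - tails}")
```

    The ratio typically drifts closer to 0.5 as the number of flips grows, while the raw gap between heads and tails tends to get larger, not smaller.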

    Now I have a question:

    Imagine you flipped a coin 141 times, and you ended up with 9 heads and 132 tails. What would you think?


  • Registered Users Posts: 9,497 ✭✭✭ antiskeptic


    It is true that you would be unlikely to get an equal number. Very unlikely, in fact. However, if both groups were chosen at random, and you waited until you found approximately 140 people with red hair, you'd be very surprised if over 90% of those 140 were from one group with only 10% from the other. Do you agree?

    Given the group size of 24000, my inclination is to suppose that 90/10 is not as dramatic as it sounds.

    Say it was 1 and 9 in two groups of 2 million. Would you be impressed by the 90/10 split? I doubt it.

    Thus my question. How significant are these numbers given their relatively tiny size vs. the group size. Folk seem to simply assume it's significant and I ask how is that decided upon.


    Very true, and in fact it is a somewhat common misconception in the field of probability that the numbers of each will eventually be the same as the number of trials increases. Of course, this is not true. Again, as in your last example, the chance of the numbers being exactly equal is extremely small.

    However, as the number of trials increases, if the coin is fair, you'd expect

    (number of heads)/(total number of flips) -> 1/2

    i.e. it should approach 1/2, but not necessarily equal 1/2.

    Now I have a question:

    Imagine you flipped a coin 141 times, and you ended up with 9 heads and 132 tails. What would you think?

    I'd think the coin was bent. But we didn't flip the coin 141 times.

    My point with the coin toss picture was to illustrate the nature of the problem. We wouldn't expect the same figures in a 2nd trial of two groups of 24000. What would we get, other than just assuming we'd get something like we already got?


  • Registered Users Posts: 9,497 ✭✭✭ antiskeptic


    LLMMLL wrote: »
    Nope the natural spread is accounted for. The whole point of the most basic statistical tests is to account for natural spread. It's not a side issue that they forgot to address. It's a core issue that all statistics automatically addresses.

    Pressing a few buttons on my dash automatically makes the air temperature in the car 23C. But there's a bit more to automatic than waving a magic wand (and in the case of car heating controls I could explain how the automatic works)

    Could you explain how statistics automatically knows that 132 would be fairly representative of the number of Covid cases expected in all populations of 24000, if you carried out a test on 1000 groups of 24000?

    You say account is taken. I ask how is account taken.


  • Registered Users Posts: 782 ✭✭✭ Doc07


    Given the group size of 24000, my inclination is to suppose that 90/10 is not as dramatic as it sounds.

    Say it was 1 and 9 in two groups of 2 million. Would you be impressed by the 90/10 split? I doubt it.

    Thus my question. How significant are these numbers given their relatively tiny size vs. the group size. Folk seem to simply assume it's significant and I ask how is that decided upon.

    You are not comparing like with like ( possibly by innocent accident but more likely you are back for some more disingenuous sh!thousery)

    From your example above, no reasonable scientist or person would be too impressed with 1/2million versus 9/2million as the result is not statistically significant ie could be due to chance. Try it out yourself on any number of free stats calculators on google.

    Take the number of Covid infections in Pfizer trial for another example. 8 in 20 thousand (vaccine) versus 162 in 20 thousand (placebo) is a highly statistically significant difference. Again, try it yourself; it takes about 10 seconds on a free stats website. The probability of the result being due to chance is so close to zero that no reasonable person with basic education would maintain it is spurious. Take the proportional difference which determined the efficacy value of 95%. The statistical plan allows for calculating that it is highly likely that the true value in the full population (not just the large trial sample population) would be between approx 90 and 97%.
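
    The figures quoted above can be reproduced approximately with a few lines of Python. This is a sketch under stated assumptions, not the trial's actual analysis: it assumes two equal-sized groups with equal follow-up, conditions on the 170 total cases, and uses an exact (Clopper-Pearson) binomial interval for the vaccine group's share of cases. The helper names are invented for illustration.

```python
from math import comb

vaccine_cases, placebo_cases = 8, 162        # figures quoted in the post
total_cases = vaccine_cases + placebo_cases

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve(f, target, increasing):
    """Bisection on (0, 1) for a monotone f with f(theta) == target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def efficacy(theta):
    """Efficacy implied by theta, the vaccine group's share of all cases.
    With equal group sizes the case-rate ratio is theta / (1 - theta)."""
    return 1 - theta / (1 - theta)

alpha = 0.05  # for a 95% interval
# Smallest and largest shares of cases consistent with seeing 8 of 170.
theta_low = solve(lambda t: 1 - binom_cdf(vaccine_cases - 1, total_cases, t),
                  alpha / 2, increasing=True)
theta_high = solve(lambda t: binom_cdf(vaccine_cases, total_cases, t),
                   alpha / 2, increasing=False)

point = efficacy(vaccine_cases / total_cases)
eff_low, eff_high = efficacy(theta_high), efficacy(theta_low)
print(f"efficacy estimate: {point:.1%} (95% interval {eff_low:.1%} to {eff_high:.1%})")
```

    With these inputs the point estimate is about 95% and the interval comes out at roughly 90% to 98%, in line with the range quoted above.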

    The statistical analyses for these trials are pre-planned, pre-published and highly scrutinised so that if a difference is seen between groups it can be determined with a very high level of probability that it is not due to chance. Therefore, worrying about natural spread if you did it 200 times etc might sound like a clever or noble pursuit but it isn’t.


  • Registered Users Posts: 4,581 ✭✭✭ LLMMLL


    Pressing a few buttons on my dash automatically makes the air temperature in the car 23C. But there's a bit more to automatic than waving a magic wand (and in the case of car heating controls I could explain how the automatic works)

    Could you explain how statistics automatically knows that 132 would be fairly representative of the number of Covid cases expected in all populations of 24000, if you carried out a test on 1000 groups of 24000?

    You say account is taken. I ask how is account taken.

    It doesn't have to know that 132 is representative. It doesn't claim certainty. Statistics is based on probability theory. They have chosen a group size such that the probability of measuring a value in that group that differs from what the value would be if you could measure the entire population is very low.

    So for example, assuming the average height of men in Ireland is 5ft11, how likely would it be that if you took a random 1000 men their average height would be 6ft4? VERY unlikely. But as you say it's not certain. We could, with extremely low probability, happen to land on 1000 extremely tall guys and no short guys. It's just super unlikely.

    Trying to prove this to you (someone with a very low knowledge of statistics) would take pages and pages of mathematics, which is unrealistic here. If you won't accept intuitive explanations and need the details then you'll need to go to a textbook and spend probably at least a month learning the mathematics behind all this. Are you willing to do that?

    Textbook: Casella and Berger, Statistical Inference.


  • Registered Users Posts: 4,581 ✭✭✭ LLMMLL


    Given the group size of 24000, my inclination is to suppose that 90/10 is not as dramatic as it sounds.

    Say it was 1 and 9 in two groups of 2 million. Would you be impressed by the 90/10 split? I doubt it.

    Thus my question. How significant are these numbers given their relatively tiny size vs. the group size. Folk seem to simply assume it's significant and I ask how is that decided upon.


    I'd think the coin was bent. But we didn't flip the coin 141 times.

    My point with the coin toss picture was to illustrate the nature of the problem. We wouldn't expect the same figures in a 2nd trial of two groups of 24000. What would we get, other than just assuming we'd get something like we already got?

    You'd expect to get a value within the confidence interval. You would be extremely unlikely to get 9.
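
    To put rough numbers on that, here is a sketch that assumes every one of the 24,000 placebo participants independently has the same chance of becoming a case, taken to be 132/24000 from the figures in the opening post. Real exposure is not that tidy, so treat the output as an illustration only; binom_cdf is a helper written for this example.

```python
from math import comb, sqrt

n = 24_000            # people per group (figure assumed in the opening post)
p = 132 / n           # assumed true case rate without a vaccine

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

mean = n * p
sd = sqrt(n * p * (1 - p))          # standard deviation of the case count
# Three standard deviations either side covers about 99.7% of repeat groups
# (normal approximation).
low, high = mean - 3 * sd, mean + 3 * sd
prob_9_or_fewer = binom_cdf(9, n, p)

print(f"expected cases: {mean:.0f}, standard deviation: {sd:.1f}")
print(f"about 99.7% of repeat groups fall between {low:.0f} and {high:.0f} cases")
print(f"chance of 9 or fewer cases: {prob_9_or_fewer:.2e}")
```

    On these assumptions repeat placebo groups would mostly land within a few dozen cases of 132, and a count as low as 9 would be practically impossible without something (such as a vaccine) changing the rate.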


  • Moderators, Science, Health & Environment Moderators Posts: 1,832 Mod ✭✭✭✭ Michael Collins


    Given the group size of 24000, my inclination is to suppose that 90/10 is not as dramatic as it sounds.

    Good questions!

    The group size of 24,000 is of no significance for inferring the efficacy. What matters is the number of cases found, and which half of the trial they fall in.

    Say it was 1 and 9 in two groups of 2 million. Would you be impressed by the 90/10 split? I doubt it.

    If the numbers were 1 from the vaccinated group and 9 from the placebo group, then these results would be less convincing, since there are fewer trials. Although, even this would strongly suggest the vaccine has a high efficacy. Going back to the coin analogy (which is a good one), how likely would it be to get 1 head and 9 tails from 10 flips?

    I'd think the coin was bent.

    Exactly, it's extremely suggestive that it's not a fair coin. So applying this to the vaccine results, what does it suggest for the vaccine efficacy?

    But we didn't flip the coin 141 times.

    But we did! That is what is implied in a trial such as the one you proposed. We have two halves of the trial, each consisting of an equal number of people chosen at random. If the vaccine has no effect, someone testing positive is equally likely (50%) to be from the vaccinated group or the placebo group.

    In your original post, you gave the numbers of 9 positives in the vaccinated group and 132 in the placebo group. Assuming the vaccine is useless, this is exactly analogous to flipping a fair coin 141 times and finding 9 heads and 132 tails, which, I think we all agree, is very unlikely. What's the conclusion for the vaccine efficacy (and the coin)?

    My point with the coin toss picture was to illustrate the nature of the problem. We wouldn't expect the same figures in a 2nd trial of two groups of 24000. What would we get, other than just assuming we'd get something like we already got?

    Given we have 141 cases (or coin flips, if you like) from your first trial, the variance can be shown to be quite narrow/small. In another trial, if you wait until you have at least a similar number of cases, the breakdown per group would be such that the chances of the vaccine having an efficacy of over 85% would be 99%.
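
    The coin-flip comparison can be checked directly. The sketch below assumes a fair coin (equivalently, a vaccine with no effect and equal-sized groups) and adds up the exact binomial probabilities; tail_probability is a name chosen for this example.

```python
from fractions import Fraction
from math import comb

def tail_probability(max_heads, flips):
    """Exact P(heads <= max_heads) for a fair coin flipped `flips` times."""
    favourable = sum(comb(flips, k) for k in range(max_heads + 1))
    return Fraction(favourable, 2**flips)

p_small = tail_probability(1, 10)     # 1 head or fewer in 10 flips
p_trial = tail_probability(9, 141)    # 9 heads or fewer in 141 flips

print(f"<=1 head in 10 flips:   {float(p_small):.4f}")
print(f"<=9 heads in 141 flips: {float(p_trial):.2e}")
```

    The first figure is 11/1024, about 1 in 93, which is suggestive but not overwhelming; the second is vanishingly small, which is why 141 cases carry so much more weight than 10.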

