
Probability Question

  • 19-04-2018 7:52am
    #1
    Registered Users, Registered Users 2 Posts: 4,231 ✭✭✭


    Hi All,

    I was typing a list of four names at work just now when I noticed that whether I alphabetised them by first name or surname the list remained the same, see example below:

    Aaron Aaronsson
    Benny Brown
    Catherine Caufield
    Dennis Dimbleby

    I then thought to myself, what are the odds of this happening? Which brings me here......?

    Cheers,

    Hercule


Comments

  • Moderators, Science, Health & Environment Moderators Posts: 1,852 Mod ✭✭✭✭Michael Collins


    Interesting question!

    One (particularly bad) way of calculating this would be to assume all of the following:

    i) the first and last names can start with any of the 26 letters in the alphabet
    ii) that each of the 26 possible starting letters is equally likely
    iii) that the starting letters for the first and last names are independent of each other

    In this case, the chance of any one name having the same starting letter for the first and last names would be 1/26.

    The chances of this happening 4 times would be

    (1/26)^4 = 1/456,976

    Again, this assumes that the names on the list are independent of each other (or at least that the chance of one name's first and last initials matching is independent of whether that has already happened for another name on the list).

    In reality, I would say that none of these assumptions are true!

    An accurate calculation of this would require some specific knowledge about the relative frequency of each starting letter, some analysis about the correlation between first names and last names (which would be language/race dependent), etc...
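As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes, as in the post, that first and last initials are independent, so the chance of a match for one name is the sum over letters of P(first initial) × P(last initial). The function name `match_probability` and the skewed three-letter frequencies are made up purely for illustration and are not real name data.

```python
from fractions import Fraction
import string


def match_probability(first_freq, last_freq):
    """Chance that one name's first and last initials match,
    assuming the two initials are independent of each other."""
    return sum(
        Fraction(p) * Fraction(last_freq.get(letter, 0))
        for letter, p in first_freq.items()
    )


# Assumption from the post: all 26 letters equally likely.
uniform = {letter: Fraction(1, 26) for letter in string.ascii_uppercase}
one_name = match_probability(uniform, uniform)
four_names = one_name ** 4  # four independent names
print(one_name, four_names)  # 1/26 1/456976

# Hypothetical skewed frequencies (NOT real data), to show the direction
# of the effect when some letters are more common than others.
skewed = {"A": Fraction(1, 2), "B": Fraction(1, 4), "C": Fraction(1, 4)}
skewed_match = match_probability(skewed, skewed)
print(skewed_match)  # 3/8, higher than the 1/3 a uniform 3-letter alphabet gives
```

With unequal letter frequencies the per-name chance of a match goes up relative to the uniform case, which is one reason the 1/456,976 figure is likely an underestimate.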


  • Registered Users, Registered Users 2 Posts: 5,141 ✭✭✭Yakuza


    If you could get your hands on a digital copy of the phone book, and import in the names of the entries there (I think it only has initials of the first name but that's all we care about), that'd give you a reasonable profile of initials and surnames in Ireland. As MC says above, it'd be fairly country dependent as letters that would be relatively rare here would be common elsewhere.

    Even better than the phone book (which is a bit old school, I'm not in it and haven't been for years) would be a subscriber list for a mobile company. Census data would be even better :)

    Anyhoo, once you got your data it would be relatively easy to work out the combinations of first initial and first letter of the surname and use that as a proxy for your probability table.

    I don't think it's something you can properly work out in an analytic sense as it's not really possible to define the entire set of names you're working with as it's constantly changing (births, deaths, marriages etc), but what I've touched on above could give you some likelihoods. (These are sort of pseudo-probabilities, based on observed data).
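The table of initial combinations described above could be built along these lines. This is only a sketch: it assumes the data arrives as plain "First Last" strings, and the helper names and the tiny sample list are hypothetical stand-ins for a real phone book, subscriber list or census extract.

```python
from collections import Counter


def initial_pair_frequencies(names):
    """Relative frequency of each (first initial, surname initial) pair."""
    pairs = Counter()
    for full_name in names:
        parts = full_name.split()
        if len(parts) < 2:
            continue  # skip entries without both a first name and a surname
        pairs[(parts[0][0].upper(), parts[-1][0].upper())] += 1
    total = sum(pairs.values())
    if total == 0:
        return {}
    return {pair: count / total for pair, count in pairs.items()}


def same_initial_share(freqs):
    """Observed share of names whose two initials are the same letter."""
    return sum(share for (first, last), share in freqs.items() if first == last)


# Hypothetical sample, far too small to mean anything.
sample = ["Aaron Aaronsson", "Benny Brown", "Mary Byrne", "Sean Murphy"]
freqs = initial_pair_frequencies(sample)
print(same_initial_share(freqs))  # 0.5 for this made-up sample
```

Because the pairs are counted jointly, any correlation between first and last initials in the data is picked up automatically, with no independence assumption needed.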


  • Registered Users, Registered Users 2 Posts: 1,595 ✭✭✭MathsManiac


    It seems to me that neither of the previous replies has answered the question actually posed, largely because the example offered by the OP was poorly chosen.

    Although the example given had each first name starting with the same letter as the surname, the question actually asked was this: what is the probability that alphabetizing a list of four names by first name and alphabetizing it by last name will yield the same ordering.

    This is a much easier question to answer than the one addressed in the two replies, which both sought to answer: "What is the probability that in four randomly selected names (each being a first-name / last-name combination), the first name will start with the same letter as the last name?"

    To answer the question posed, I think that, apart from the assumption that no two of the people share the same first name or last name, the only assumption required is that of independence of first-name alphabetical location and last-name alphabetical location. (That is, that the position of your first name in any alphabetical list of first names is independent of the position of your last name in any alphabetical list of last names.) More plausible than MC's assumptions, for sure, but I think probably not true either. I think, though, that it is probably close to being true within a culturally homogeneous population.

    Anyway, with this assumption, the question is easy to answer: 1/24.
    There are a few ways to think of it. Here's one: replace each first name with the number 1, 2, 3 or 4 in accordance with its order in an alphabetical list of the four first names, and replace each surname with the letter A, B, C, or D in accordance with its order in an alphabetical list of the four surnames. If the numbers 1,2,3,4 are paired at random with the letters A, B, C and D, then the probability that 1 is paired with A, 2 with B, etc is (1/4)*(1/3)*(1/2) = 1/24.
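The pairing argument above can be checked by brute force. The sketch below lists every way the surname ranks can be matched against the first-name ranks 0..n-1 and counts how many leave the order unchanged. It assumes, as in the post, that all pairings are equally likely and that there are no ties. The function name is just for illustration.

```python
from fractions import Fraction
from itertools import permutations


def same_order_probability(n):
    """Exact chance that two independent, tie-free orderings of n names agree."""
    ranks = tuple(range(n))
    pairings = list(permutations(ranks))  # every possible surname ordering
    matches = sum(1 for p in pairings if p == ranks)  # only the identity matches
    return Fraction(matches, len(pairings))


print(same_order_probability(4))  # 1/24
```

Only one of the 4! = 24 pairings keeps both lists in the same order, which agrees with (1/4)*(1/3)*(1/2).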

    If the independence assumption is not true, then a positive correlation between the two would increase the probability a little and a negative one would decrease it.
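To get a feel for the effect of correlation, here is a small Monte Carlo sketch. The model is an assumption of mine, not something from the thread: each person's first-name position and surname position are drawn as a pair of standard normal values with correlation `rho`, and the four people are independent of each other. Real names need not behave like this, so treat the output as illustrative only.

```python
import math
import random


def estimate_same_order(rho, n_names=4, trials=100_000, seed=0):
    """Estimate the chance that sorting by first name and sorting by surname
    give the same order, under an assumed bivariate-normal model."""
    rng = random.Random(seed)
    scale = math.sqrt(1 - rho * rho)
    hits = 0
    for _ in range(trials):
        people = []
        for _ in range(n_names):
            x = rng.gauss(0, 1)                    # first-name position
            y = rho * x + scale * rng.gauss(0, 1)  # surname position, correlation rho
            people.append((x, y))
        by_first = sorted(range(n_names), key=lambda i: people[i][0])
        by_last = sorted(range(n_names), key=lambda i: people[i][1])
        hits += by_first == by_last
    return hits / trials


for rho in (-0.5, 0.0, 0.5):
    print(rho, estimate_same_order(rho))
```

With `rho = 0` the estimate should land near 1/24 (about 0.042); positive values of `rho` push it up and negative values push it down, in line with the remark above.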

