
Multi-Variate Analysis help!

  • 08-03-2010 05:37PM
    #1
    Registered Users, Registered Users 2 Posts: 929 ✭✭✭


    I am very confused about this MVA course question. We understand how the linkages relate to the dissimilarity measures, but don't know where to begin with the proofs.

    Question:
    A dissimilarity measure d(x, y) for two data points x and y typically
    satisfies the following three properties:

    1. d(x, y) <= 0 and d(x, y) = 0 if and only if x = y
    2. d(x, y) = d(y, x)
    3. d(x, z)<= d(x, y) + d(y, z)

    The following have also been proposed as methods for measuring the dissimilarity between two
    sets of data points A = {xa1 , xa2 , . . . , xam} and B = {xb1 , xb2 , . . . , xbn}:
    • Single Linkage: d(A, B) = minx2A,y2Bd(x, y)
    • Complete Linkage: d(A, B) = maxx2A,y2Bd(x, y)
    • Average Linkage: d(A, B) = 1|A||B| Px2APy2B d(x, y)

    For each of the proposed linkage methods and dissimilarity properties, show that the linkage method satisfies that property or provide a counterexample (a visual/diagram representation of any counterexample is sufficient if appropriate).


Comments

  • Registered Users, Registered Users 2 Posts: 338 ✭✭ray giraffe


    I understand the question to be the following.

    If the sets A and B are regarded as two 'points', and we use a definition of linkage (e.g. 'single linkage'), is each of the properties 1, 2, 3 satisfied?

    E.g. is it true that d(A,B) <= 0? Experiment if you are not sure.


  • Registered Users, Registered Users 2 Posts: 966 ✭✭✭equivariant


    Your notation is confusing. The properties of a dissimilarity measure seem clear (it's like a metric, except that the values are negative). However, I don't understand your notation for the examples.

    e.g. You say
    • Single Linkage: d(A, B) = minx2A,y2Bd(x, y)

    what do you mean by x2A? or by y2B? or what are x and y here? If I could understand this notation, maybe I could say something about how to prove/disprove that it satisfies the given properties.


  • Registered Users, Registered Users 2 Posts: 929 ✭✭✭sternn


    Sorry, the question I posted was a bit of a mess.

    • Single Linkage: d(A, B) = min over (x an element of A), (y an element of B) of d(x, y)
    • Complete Linkage: d(A, B) = max over (x an element of A), (y an element of B) of d(x, y)
    • Average Linkage: d(A, B) = 1/(|A||B|) * Sum over (x an element of A) of Sum over (y an element of B) of d(x, y)
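
    For reference, the same three definitions can be written in standard notation. This is only a typeset reading of the list above; the subscript labels on d are added here for clarity and are not part of the original question.

```latex
\begin{align*}
d_{\text{single}}(A,B)   &= \min_{x \in A,\; y \in B} d(x,y) \\
d_{\text{complete}}(A,B) &= \max_{x \in A,\; y \in B} d(x,y) \\
d_{\text{average}}(A,B)  &= \frac{1}{|A|\,|B|} \sum_{x \in A} \sum_{y \in B} d(x,y)
\end{align*}
```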


  • Registered Users, Registered Users 2 Posts: 966 ✭✭✭equivariant


    sternn wrote: »
    Sorry, the question I posted was a bit of a mess.

    • Single Linkage: d(A, B) = min over (x an element of A), (y an element of B) of d(x, y)
    • Complete Linkage: d(A, B) = max over (x an element of A), (y an element of B) of d(x, y)
    • Average Linkage: d(A, B) = 1/(|A||B|) * Sum over (x an element of A) of Sum over (y an element of B) of d(x, y)

    OK, that makes more sense. Also, I suspect that in your original post, property 1. should be

    1. d(x, y) >= 0 and d(x, y) = 0 if and only if x = y

    and not 1. d(x, y) <= 0 and d(x, y) = 0 if and only if x = y as otherwise, nothing works.

    Assuming that is true, then the properties you have listed are those that are characteristic of a "metric". If you google for "metric space" you will find lots of info about these properties. For example http://en.wikipedia.org/wiki/Metric_space

    Back to your original question, in the case of single linkage: this does not satisfy property 1. For example, taking d(x, y) = |x - y|, consider the sets A = {1, 2} and B = {2, 3}. Clearly A and B are different, but d(A, B) = 0 because they share the point 2.

    It does have property 2, but it does not have property 3. Prop. 2 is easy to see in this case. To see that it does not have property 3, consider the following example: A = {1}, B = {2, 4}, C = {5}. Here d(A, C) = 4, while d(A, B) = 1 and d(B, C) = 1, so d(A, C) > d(A, B) + d(B, C) and prop. 3 does not hold.

    You can analyse the other examples in a similar way.
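
    If it helps to experiment, here is a minimal Python sketch for checking examples like these numerically. It assumes the points are real numbers with d(x, y) = |x - y|; the function names `single`, `complete` and `average` are just illustrative choices, not anything from the course.

```python
from itertools import product


def d(x, y):
    # Assumed point dissimilarity: absolute difference on the real line.
    return abs(x - y)


def single(A, B):
    # Smallest dissimilarity over all pairs (x in A, y in B).
    return min(d(x, y) for x, y in product(A, B))


def complete(A, B):
    # Largest dissimilarity over all pairs (x in A, y in B).
    return max(d(x, y) for x, y in product(A, B))


def average(A, B):
    # Mean dissimilarity over all |A| * |B| pairs.
    return sum(d(x, y) for x, y in product(A, B)) / (len(A) * len(B))


# Property 1, single linkage: different sets, yet dissimilarity 0.
print(single({1, 2}, {2, 3}))

# Property 3, single linkage: compare d(A, C) with d(A, B) + d(B, C).
A, B, C = {1}, {2, 4}, {5}
print(single(A, C), single(A, B) + single(B, C))

# Property 1, complete and average linkage: d(A, A) for a two-point set.
A = {1, 2}
print(complete(A, A), average(A, A))
```

    The last line gives a starting point for the other two linkages: for a set with two distinct points, d(A, A) comes out positive, so the "d = 0 when the two arguments are equal" half of property 1 is worth looking at there.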

    Hopefully this helps


  • Registered Users, Registered Users 2 Posts: 338 ✭✭ray giraffe


    equivariant wrote: »
    OK, that makes more sense. Also, I suspect that in your original post, property 1. should be

    1. d(x, y) >= 0 and d(x, y) = 0 if and only if x = y

    and not 1. d(x, y) <= 0 and d(x, y) = 0 if and only if x = y as otherwise, nothing works.

    You're right. The definition given by the OP implies that the space has at most one point, which is fairly pointless :D

    Concretely: if we have 2 distinct points x, y, then d(x,y) < 0 by rule 1. Putting z = x in rule 3 gives d(x,x) <= d(x,y) + d(y,x). Since d(x,x) = 0 by rule 1 and d(y,x) = d(x,y) by rule 2, we get 0 <= 2d(x,y) < 0, a contradiction.
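
    Written as a single chain, using the OP's version of rule 1 (d <= 0, with equality only when the points coincide) and assuming x and y are distinct:

```latex
\[
0 \;=\; d(x,x) \;\le\; d(x,y) + d(y,x) \;=\; 2\,d(x,y) \;<\; 0 ,
\]
```

    which is impossible, so no two distinct points can exist.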


  • Moderators, Science, Health & Environment Moderators Posts: 1,855 Mod ✭✭✭✭Michael Collins


    ray giraffe wrote: »
    You're right. The definition given by the OP implies that the space has at most one point, which is fairly pointless :D

    Nice.

