
C# - best way to approach problem

  • 22-06-2012 10:22am
    #1
    Registered Users, Registered Users 2 Posts: 3,615 ✭✭✭


    Hi,

    I'm developing an object recognition program and need some advice on how to best approach this. The problem is as follows: I have a glove hanging from the ceiling with fingers pointing downward about a meter in front of a relatively plain surface. What I want to do is locate the positions of the fingertips of the glove.


    I have a depth image of the scene so for every pixel in the image I have a corresponding depth value in millimeters.

    My plan was to loop through the pixels starting at the bottom of the image and, whenever I encounter a significant change in depth, treat that as the start of a new object. I then have to determine whether that object is the glove or not.
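    That bottom-up scan might be structured like this (a minimal Python sketch; the thread is C#, but the loop structure carries over directly; the depth image is a 2D list of millimetre values, and `find_first_object` and the 100 mm threshold are made-up illustrations, not anything from the Kinect SDK):

```python
def find_first_object(depth, threshold_mm=100):
    """Scan bottom-up, column by column; return the (row, col) of the first
    pixel whose depth jumps by more than threshold_mm relative to the pixel
    below it, i.e. the first candidate "new object"."""
    rows, cols = len(depth), len(depth[0])
    for row in range(rows - 2, -1, -1):      # start one row above the bottom edge
        for col in range(cols):
            if abs(depth[row][col] - depth[row + 1][col]) > threshold_mm:
                return (row, col)
    return None                              # no depth discontinuity found
```

    Whatever this returns is only a candidate; further checks still have to decide whether it is actually the glove.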


    Here's where my head begins to melt:
    Assuming the first part of the glove I encounter is the tip of the middle finger (it is the longest, and the fingers point downwards), I would then:
    - check that the depth is similar for pixels directly above (as I move up the middle finger);
    - check that the depth is not similar about 2cm in either horizontal direction (the gaps between fingers);
    - check that the depth is similar again as I move further in each horizontal direction (the index and ring fingers).
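    Those three checks might be sketched as follows (Python for brevity; `looks_like_fingertip`, the pixel distances and the 30 mm tolerance are all invented illustrative values, since the real numbers depend on the camera and the glove's distance from it):

```python
def looks_like_fingertip(depth, row, col, gap_px=10, finger_len_px=20, tol_mm=30):
    """Test whether (row, col) plausibly sits on the middle fingertip:
    similar depth moving up the finger, clearly different depth in the
    gaps to either side, and similar depth again where the neighbouring
    fingers should be."""
    # keep the whole test window inside the image
    if row - finger_len_px < 0 or col - 2 * gap_px < 0 or col + 2 * gap_px >= len(depth[0]):
        return False
    tip = depth[row][col]
    # 1. similar depth straight up along the middle finger
    for dy in range(1, finger_len_px + 1):
        if abs(depth[row - dy][col] - tip) > tol_mm:
            return False
    # 2. clearly different depth ~2 cm to either side (gaps between fingers)
    for dx in (-gap_px, gap_px):
        if abs(depth[row][col + dx] - tip) <= tol_mm:
            return False
    # 3. further up and further out, the index and ring fingers reappear
    up = row - finger_len_px
    for dx in (-2 * gap_px, 2 * gap_px):
        if abs(depth[up][col + dx] - tip) > tol_mm:
            return False
    return True
```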


    I'm not looking for someone to do this for me, but if someone could suggest a good way to structure the code it would be a huge help, because I'm unsure which loops would do this most efficiently and how best to structure them. I'm a fairly novice programmer (good choice of thesis then :rolleyes:).

    Or any advice on better ways to approach this would be appreciated.


    Thanks


Comments

  • Registered Users, Registered Users 2 Posts: 7,157 ✭✭✭srsly78


    There are libraries built specifically for this: OpenCV is one -> http://en.wikipedia.org/wiki/OpenCV (lol at the example picture, it's exactly the same as your problem)
    Doing it yourself is indeed head melting as you have discovered.


  • Registered Users, Registered Users 2 Posts: 2,040 ✭✭✭Colonel Panic


    It's not much of a thesis project if he just hooks up a library to do all his dirty work.


  • Registered Users, Registered Users 2 Posts: 3,323 ✭✭✭padraig_f


    It sounds similar to OCR, so you could look at some open-source OCR libraries or OCR algorithms and see how they work. You'd get credit for that in the thesis as well, if you document the research and say how you used it, or adapted it to your own problem.

    Off the top of my head, how I might tackle it: draw an imaginary line from the bottom of the image towards the top until you hit a certain depth (which represents the hand). Take the length of that line and store it in an array. Do that for every pixel along the bottom of the image. You then have an array of the lengths of these lines, and you profile that array to match certain characteristics (i.e. it has 5 peaks for the fingers, and 4 valleys in between for the gaps). You configure the profiling function with parameters to adjust how tolerant it is, test against the image, adjust the parameters, and add more parameters if necessary.

    e.g. array looks something like:
    [0,0,0,0,74,75,76,75,74,50,49,50,74,75,76,75,74.....0,0,0,0]

    where 75 (plus or minus 1) represents tips of the fingers, and 50 (plus or minus 1) represents the joints in between.

    The profiling function is still difficult, but what you have now is a 2-d graph so maybe you can use some mathematical technique that takes a 2-d graph and interprets the characteristics of the curve.
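    One way to read that into code (a Python sketch under assumptions: it measures each column's silhouette height, which yields exactly the kind of peaks-and-valleys array shown above; `column_heights`, `count_peaks` and the 50 mm tolerance are made-up names and values):

```python
def column_heights(depth, hand_depth_mm, tol_mm=50):
    """For each column, count how many pixels lie at (roughly) the hand's
    depth: finger columns give tall counts, gap columns short ones,
    background columns 0."""
    rows, cols = len(depth), len(depth[0])
    return [sum(1 for row in range(rows)
                if abs(depth[row][col] - hand_depth_mm) < tol_mm)
            for col in range(cols)]

def count_peaks(profile):
    """Count local maxima in the profile; a five-fingered hand should give 5."""
    # collapse runs of equal values so flat-topped peaks count once
    compressed = [v for i, v in enumerate(profile) if i == 0 or v != profile[i - 1]]
    return sum(1 for i in range(1, len(compressed) - 1)
               if compressed[i] > compressed[i - 1] and compressed[i] > compressed[i + 1])
```

    The peak count (plus valley depths, widths, and so on) is then what the profiling function matches against, with the tolerances exposed as parameters.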


    Have a look at edge-detection image-processing algorithms as well. There are some relatively simple algorithms to do this (though I'm sure some complex ones as well), and it's in the same ballpark as what you're doing.
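    A minimal finite-difference version of that idea (an assumption-laden sketch, not any named algorithm: a pixel is marked as an edge when the depth gradient to its right and lower neighbours is large, which picks out outlines like the glove's):

```python
def depth_edges(depth, threshold_mm=100):
    """Mark pixels where the depth changes sharply towards the right or
    downward neighbour; the last row/column is left unmarked for simplicity."""
    rows, cols = len(depth), len(depth[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            gx = depth[r][c + 1] - depth[r][c]   # horizontal depth gradient
            gy = depth[r + 1][c] - depth[r][c]   # vertical depth gradient
            if abs(gx) + abs(gy) > threshold_mm:
                edges[r][c] = True
    return edges
```

    Real edge detectors (Sobel, Canny) smooth first and use larger kernels, but the principle of thresholding a gradient is the same.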


  • Registered Users, Registered Users 2 Posts: 7,157 ✭✭✭srsly78


    Colonel Panic wrote: »
    It's not much of a thesis project if he just hooks up a library to do all his dirty work.

    Nope, but it's an open-source library with lots of documentation, so he can take inspiration from it.


  • Registered Users, Registered Users 2 Posts: 3,615 ✭✭✭Mr.Plough


    padraig_f wrote: »
    It sounds similar to OCR, so you could look at some open-source OCR libraries or OCR algorithms and see how they work. [...]

    Interesting, I'll look into this. Even if I don't use it, it's good to have a variety of possible solutions to write about.

    In the shower today I was thinking of the following:

    Get the physical glove and draw a template around it on paper, placing various points at different locations and making sure to place points between the fingers as well. I then convert these point locations from mm to pixels.

    Then have one of the points at (X, Y), and the rest at X ± cx and Y ± cy, where cx and cy are the distances in pixels of the other points from the (X, Y) point.

    Then, starting at X = 0 and Y = 0, loop through the pixels with something like:

    if the depths at all the glove points are similar AND the depths at the gap points are significantly different from those at the glove points

    break

    and you've found the glove. It's essentially template matching, but it could be pretty versatile and work even when there are other objects in the scene. I'm using the Microsoft Kinect SDK, so I can easily transform between pixel and global coordinate systems using built-in functions.
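    That template idea might look something like this (Python sketch; the offsets, thresholds and names are all invented for illustration, and a real template would carry a point per finger and per gap rather than the three-and-two used here):

```python
# (dy, dx) offsets relative to the anchor point, taken from the paper template
GLOVE_POINTS = [(0, 0), (-15, -8), (-15, 8)]   # points that land on the glove
GAP_POINTS = [(-5, -4), (-5, 4)]               # points that land between fingers

def find_glove(depth, tol_mm=30, gap_mm=200):
    """Slide the point template over the depth image; return the first (y, x)
    where every glove point shares the anchor's depth and every gap point
    clearly does not."""
    rows, cols = len(depth), len(depth[0])
    for y in range(15, rows):                  # margins keep the template in-bounds
        for x in range(8, cols - 8):
            anchor = depth[y][x]
            on_glove = all(abs(depth[y + dy][x + dx] - anchor) < tol_mm
                           for dy, dx in GLOVE_POINTS)
            in_gaps = all(abs(depth[y + dy][x + dx] - anchor) > gap_mm
                          for dy, dx in GAP_POINTS)
            if on_glove and in_gaps:
                return (y, x)
    return None
```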

    I won't get near a computer until Sunday, so I'll update then when I no doubt run into problems!

