[Question] OCR and comparison - boards.ie
06-12-2010, 11:18   #1
Fearing Trent
Join Date: Jun 2001
Location: Uibh Fhaile
Posts: 3,251
HMod: Keystone
OCR and comparison

I am posting here in hopes that A) I have the right section and B) you can help me.

I am faced with an issue at work: the status of some 3rd party devices is displayed on a screen, but we get no information from it in numerical format and it is a closed source program. We need to create reports based on the status of these devices, and in the past we just sat down and took a manual note of which device was in which state. The issue we now face is that they want the reports to be more detailed, so this manual task has to be done more often.

We are pushing for access to the back end that drives this system, but to be honest that could take quite some time. What we do at the moment is run a program that takes screen captures at set intervals during the day so we can sit down at a quiet time and compare them. This is now proving to be a little too time consuming.

What I was hoping to find was a program that could use OCR to identify each label and then export it into a spreadsheet with a corresponding flag (yes or no) depending on the colour of the text. A programming friend of mine explained how this was possible, and I can see it would be, but in order to get something approved for installation on the network it has to be either a commercial grade product with support or something of an open nature that can be examined internally if required.

So after all that waffle, can anyone make any suggestions?
06-12-2010, 12:39   #2
Registered User
Join Date: Jul 2002
Location: Dublin
Posts: 401
You don't mention the platform, but if this is a Windows app, for instance, you can extract details from other running applications' windows programmatically. There's a pretty extensive API in Windows for accessing windows and the controls (child windows) in them, injecting or reading data, manipulating messages, etc. I've used that kind of thing before to insert input into the GUI of another program, eg filling out text boxes (as if from user keyboard input) in another arbitrary app I didn't have any other access to. I was using it for a simple automation application (filling out a stored username/password in some arbitrary apps' login windows), which along with automated GUI testing is probably the main use for this kind of thing.

AFAIK it shouldn't be too difficult to do the opposite and extract details back out, though this does very much depend on how the app you're trying to talk to is built. If it's not using normal Windows text, for example, this probably won't work (though you'd still use these techniques as a source for your visual-based solution, I guess). Setup behaviour would be on a similar basis to the Spy++ tool that comes with Visual Studio: interrogate windows (based on the user's selection) to get the window handles of the controls you want to extract data from, and then programmatically access them later on to gather your data.

Here's a few articles on the general topic that should help;

Google should turn up plenty more. Most of those articles discuss doing this in C++ Win32, though if you want to do it in .NET, P/Invoking works fine for most of those APIs in my experience.

I guess I'd use Spy++ (or roll your own) to interrogate this app's windows and try and figure out its structure, which should give you a better idea of how you could manipulate it.
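To give a rough idea of what "extracting details back out" could look like, here is a minimal Python sketch using ctypes against user32.dll. It assumes the target app uses standard Windows controls that answer WM_GETTEXT; the function name read_control_texts and the idea of locating the window by its exact title are just illustrative assumptions, not something specific to your program. A custom-drawn app will likely return empty strings here.

```python
import ctypes
import sys

WM_GETTEXT = 0x000D        # standard message: copy a window's text into a buffer
WM_GETTEXTLENGTH = 0x000E  # standard message: length of that text in characters


def read_control_texts(window_title):
    """Return [(handle, text), ...] for every child control of a top-level window.

    window_title is a hypothetical exact caption of the target application.
    """
    if sys.platform != "win32":
        raise OSError("this sketch calls user32.dll and only runs on Windows")

    user32 = ctypes.windll.user32
    enum_proc_type = ctypes.WINFUNCTYPE(ctypes.c_int, ctypes.c_void_p, ctypes.c_void_p)

    # Declare signatures so 64-bit handles and pointers are not truncated.
    user32.FindWindowW.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p]
    user32.FindWindowW.restype = ctypes.c_void_p
    user32.SendMessageW.argtypes = [ctypes.c_void_p, ctypes.c_uint,
                                    ctypes.c_size_t, ctypes.c_void_p]
    user32.SendMessageW.restype = ctypes.c_ssize_t
    user32.EnumChildWindows.argtypes = [ctypes.c_void_p, enum_proc_type, ctypes.c_void_p]
    user32.EnumChildWindows.restype = ctypes.c_int

    parent = user32.FindWindowW(None, window_title)
    if not parent:
        raise LookupError("no top-level window titled %r" % window_title)

    results = []

    def on_child(hwnd, _lparam):
        # Ask the control for its text; this works across processes for
        # standard controls, unlike GetWindowText.
        length = user32.SendMessageW(hwnd, WM_GETTEXTLENGTH, 0, None)
        buffer = ctypes.create_unicode_buffer(length + 1)
        user32.SendMessageW(hwnd, WM_GETTEXT, length + 1, buffer)
        results.append((hwnd, buffer.value))
        return 1  # non-zero keeps the enumeration going

    callback = enum_proc_type(on_child)  # keep a reference alive during the call
    user32.EnumChildWindows(parent, callback, None)
    return results
```

Running this once against the status screen and printing the results is a quick way to check whether the labels are readable as text at all before investing in anything OCR-based.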
06-12-2010, 13:33   #3
Registered User
Join Date: Oct 1999
Location: Galway
Posts: 2,630
Or, if the data you are interested in is always at the same location on the screen, you could get the pixels at that location, figure out what colour they are, and then record the data.
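As a minimal sketch of that idea in Python, assuming each label sits at a fixed position and that green text means OK and red text means a fault (the label names, coordinates and colour rule below are all made up for illustration):

```python
import csv
import io

# Hypothetical layout: label -> (x, y) of one pixel inside that label's text.
LABEL_PIXELS = {"Pump A": (1, 0), "Pump B": (1, 1)}


def classify(rgb):
    """Map an (r, g, b) tuple to a status by its dominant channel (assumed rule)."""
    r, g, b = rgb
    if g > r and g > b:
        return "OK"
    if r > g and r > b:
        return "FAULT"
    return "UNKNOWN"


def read_statuses(pixels, label_pixels):
    """pixels is a row-major grid, so pixels[y][x] gives an (r, g, b) tuple."""
    return {label: classify(pixels[y][x]) for label, (x, y) in label_pixels.items()}


def to_csv(timestamp, statuses):
    """Render one capture as CSV rows of timestamp, label, status."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for label, status in sorted(statuses.items()):
        writer.writerow([timestamp, label, status])
    return out.getvalue()


# A tiny 2x2 stand-in for a screen capture.
capture = [
    [(0, 0, 0), (10, 200, 10)],
    [(0, 0, 0), (220, 30, 30)],
]
print(to_csv("2010-12-06 11:18", read_statuses(capture, LABEL_PIXELS)))
```

With real captures you would read the pixel from the saved image instead of a hand-built grid, for example with Pillow's Image.open(path).convert("RGB").getpixel((x, y)), and the resulting CSV opens directly in a spreadsheet.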
06-12-2010, 18:22   #4
Registered User
Join Date: Mar 2010
Location: the biggest city in the world -> Dublin every day
Posts: 394
If there were any sort of backend (even text logs) that could be used, it would be far easier to develop against than OCR, which is never 100% accurate (in the printed format anyway).