How accurate is your understanding of “accuracy”?

By Chris Riley, ECMp, IOAp posted 09-07-2010 13:43


Your first job is to start using document recognition technology; your second is to make it as accurate as possible.  The problem is that you very likely purchased the technology, and assumed some level of accuracy, without even knowing what that number means.

Even when document capture is only a twinkle in an organization's eye, people are already asking, “How accurate is it?”  This question, although reasonable, is very nearsighted.  Organizations that lead with it, and let it persist from the first exploration of the technology through to its final use, end up believing in things such as “engine voting” and buying technology on an accuracy promise.  It's just not that simple.

Even in its simplest form, full-page OCR, “accuracy” is not a catch-all.  First, there is the form of the calculation: does the percentage you're given represent accuracy measured against truth data, the percentage of characters the engine believes it got right, or the total percentage of characters minus the percentage flagged as uncertain?  These are all very different calculations.  Next, there is the depth of accuracy: is it measured at the sentence, word, or character level?  Add data capture to the mix, and you now have separate accuracy calculations for template matching, field location, and finally data type.
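To make the difference concrete, here is a minimal sketch of the three forms of calculation applied to one hypothetical OCR run. Every string, confidence score, and threshold below is invented for illustration; real engines expose these numbers through their own APIs.

```python
# Three different "accuracy" numbers for the same hypothetical OCR run.
recognized = "Tbe quick brown fox"   # engine output (one character wrong)
truth      = "The quick brown fox"   # manually verified truth data

# Per-character confidences the engine reported (0.0-1.0), one per character.
confidences = [0.55, 0.97, 0.93] + [0.95] * 16

# 1. Truth-data matched accuracy: share of characters matching the truth.
matched = sum(r == t for r, t in zip(recognized, truth))
truth_accuracy = matched / len(truth)

# 2. Engine-reported accuracy: mean of the engine's own confidence scores.
engine_accuracy = sum(confidences) / len(confidences)

# 3. Certainty-based accuracy: share of characters NOT flagged as
#    uncertain (confidence below a chosen threshold, here 0.70).
THRESHOLD = 0.70
certain = sum(c >= THRESHOLD for c in confidences)
certainty_accuracy = certain / len(confidences)

print(f"truth-matched:   {truth_accuracy:.1%}")
print(f"engine-reported: {engine_accuracy:.1%}")
print(f"certainty-based: {certainty_accuracy:.1%}")
```

All three are defensible ways to produce a percentage, yet they answer different questions, which is exactly why a bare number from a vendor tells you very little.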

So you can see that throwing around the word “accuracy” without qualification can create a large mess.  No vendor can rightly answer the question “how accurate are you?”, even after seeing your documents.  The only way to get an estimate of accuracy is to do a test run with a reasonably sized, production-level sample set and corresponding truth data (a manually reviewed, 100% accurate result).
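Once you have truth data from such a test run, the depth of measurement still matters. This hypothetical sketch (the sentences are made up) scores the same output at character and word level and gets two different numbers:

```python
# Hypothetical test run scored against truth data at two depths.
truth      = "the cat sat on the mat"
recognized = "the eat sat on the mat"   # one character wrong

# Character-level accuracy: one mistake out of all characters.
char_acc = sum(r == t for r, t in zip(recognized, truth)) / len(truth)

# Word-level accuracy: one bad character sinks the whole word.
truth_words = truth.split()
rec_words = recognized.split()
word_acc = sum(r == t for r, t in zip(rec_words, truth_words)) / len(truth_words)

print(f"character accuracy: {char_acc:.1%}")  # 95.5%
print(f"word accuracy:      {word_acc:.1%}")  # 83.3%
```

A single substituted character costs under 5% at the character level but over 16% at the word level, so a vendor quoting one depth and a buyer expecting another are talking past each other.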

I prefer to look at the percentage of uncertain characters as the broader measurement, and then at detailed accuracy for final estimates.  This works well when considering only a single OCR/ICR engine, but when you are comparing two, the story changes.

No two engines report accuracy at the same level.  I will give you a simple but real example.  On a piece of paper is the word Cat.  Engine “A” recognizes the word as Eat and reports its character confidences as 72%, 98%, and 93% respectively.  Engine “B” recognizes the word as Cat and reports its character confidences as 54%, 76%, and 72% respectively.  If you did a simple comparison to see which engine's result you liked better, you would clearly pick Engine “A”, “Eat”, and be very wrong!  This is why voting tends not to work.  While it's theoretically possible, the amount of research required to map the differences between two engines would be substantial, and it would have to be redone for every new version.  Voting can work, however, if you vote the same engine against itself.
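The Cat-versus-Eat example can be sketched as a naive confidence voter, using the exact numbers above. The dictionaries and the `mean_confidence` helper are my own illustrative construction, not any engine's real API:

```python
# Naive "engine voting" by average confidence, using the Cat/Eat example.
engine_a = {"text": "Eat", "confidences": [0.72, 0.98, 0.93]}  # wrong reading
engine_b = {"text": "Cat", "confidences": [0.54, 0.76, 0.72]}  # right reading

def mean_confidence(result):
    """Average the per-character confidences of one engine's result."""
    return sum(result["confidences"]) / len(result["confidences"])

# The voter trusts whichever engine sounds more certain...
winner = max([engine_a, engine_b], key=mean_confidence)
print(winner["text"])  # prints "Eat" -- the wrong answer
```

Engine A averages about 88% confidence and Engine B about 67%, so the voter confidently picks the misread. The scores are calibrated differently per engine, so comparing them directly is meaningless.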

I ramble.  The point is that “accuracy” tends to be the decision maker when picking OCR technology, yet it's rarely evaluated correctly.  You will see RFPs demanding 98% accuracy without providing samples, or purchases of data capture software where character-level accuracy is very high but characters never get a chance to be recognized because template-matching accuracy is very low.  In any case, whether you followed all the fancy numbers above or not, it's clear that accuracy calculations, and conversations about accuracy, are not as simple as throwing out a number.  You have to work at it.  If it seems that easy, then the suspicion flag should be raised proudly.

#ScanningandCapture #ICR #OCR #DocumentRecognition #accuracy