I have used OCR for some strange things, but this weekend I was presented with a new one. My fiancée’s mom owns one of the local wineries. I often help with various administrative tasks because, as I’ve found, technology is not high on the priority list for most wineries. The task I was given was to come up with a new digital version of the winery’s de facto back label. The original digital file was being held by a printer, and a new one needed to be created for wine that was to be bottled in just a few days.
In California, the wording and look of a wine label are very serious business. Each label has to be approved by the state, so you want to make as few changes as possible to ensure approval. However, the design of a wine label is also very important to people purchasing a bottle. For that reason, I wanted to stay as true as possible to what had already been approved. I also did not want to type the label in by hand, so I decided to OCR it. Here is what the label looked like after 600 DPI color scanning.
As you can see, this is not an ideal image for OCR. The vertical lines are fairly obtrusive against the dark background. There is stylized text at the beginning. The text does not contrast strongly with the background. The fonts are small. And the dark background prevents me from OCRing anything without some planning. So an OCR challenge it was. This is how I approached it:
First, I used the trick of inversion. I inverted the image so that the text stood out more. This trick can be used in a lot of scenarios. One of the coolest is using it to improve OCR results by doing a two-pass read on a document, one pass inverted and one not, and then reconciling the results.
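For anyone who wants to try the inversion and two-pass idea in code, here is a minimal sketch. It is not what the OCR software I used does internally. It assumes an 8-bit grayscale image held as rows of 0-255 values, and it assumes each OCR pass hands back a list of (word, confidence) pairs; the function names `invert` and `reconcile` are made up for illustration.

```python
from difflib import SequenceMatcher


def invert(pixels):
    """Invert an 8-bit grayscale image given as rows of 0-255 values."""
    return [[255 - p for p in row] for row in pixels]


def reconcile(pass_a, pass_b):
    """Merge two OCR passes, each a list of (word, confidence) pairs.

    Where both passes agree, the word is kept. Where they disagree,
    the side with the higher average confidence wins (an assumed rule,
    chosen for simplicity).
    """
    words_a = [w for w, _ in pass_a]
    words_b = [w for w, _ in pass_b]
    merged = []
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(words_a[a0:a1])
            continue
        seg_a = pass_a[a0:a1]
        seg_b = pass_b[b0:b1]
        # An empty segment scores 0.0, so a word seen in only one pass is kept.
        conf_a = sum(c for _, c in seg_a) / len(seg_a) if seg_a else 0.0
        conf_b = sum(c for _, c in seg_b) / len(seg_b) if seg_b else 0.0
        best = seg_a if conf_a >= conf_b else seg_b
        merged.extend(w for w, _ in best)
    return merged


# Hypothetical results from a normal pass and an inverted pass.
normal = [("Estate", 0.9), ("Cabemet", 0.4), ("Sauvignon", 0.8)]
inverted = [("Estate", 0.9), ("Cabernet", 0.85), ("Sauvignon", 0.8)]
print(" ".join(reconcile(normal, inverted)))
```

Aligning the two word lists before comparing them matters, because one pass may drop or split a word and a plain position-by-position comparison would then drift out of step.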
After inverting the image, I played with the contrast so that the letters were more complete, despeckled (though it really did not need it), and straightened the lines to compensate for a slight vertical bend. Next was zoning. Below is how the OCR engine automatically zoned the document (its document analysis step).
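The contrast and despeckle steps can be sketched in a few lines as well. This is only an illustration under simple assumptions, not the filters my scanning software applies: the contrast step is a plain linear stretch on a grayscale image stored as rows of 0-255 values, and the despeckle step works on an already-thresholded image (1 = ink, 0 = background) and clears any ink pixel that has no ink neighbours. Straightening is left out because it needs real image resampling.

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values so they span the full 0-255 range."""
    lo = min(min(row) for row in pixels)
    hi = max(max(row) for row in pixels)
    if hi == lo:
        # Flat image: nothing to stretch.
        return [row[:] for row in pixels]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in pixels]


def despeckle(binary):
    """Clear isolated ink pixels in a 0/1 image (1 = ink)."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    out = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            # Count ink in the 3x3 window, then subtract the pixel itself.
            ink = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ) - 1
            if ink == 0:
                out[y][x] = 0
    return out


speckled = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
for row in despeckle(speckled):
    print(row)
```

On a label with small fonts, a despeckle this aggressive can also eat periods and the dots on i's, which is one reason to skip it when the scan is already clean.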
With experience, you find out how auto zoning can be either really good or really bad. In this case, I knew that the automatic zoning would hinder the active pattern training during OCR (the second pass an OCR engine makes after an untrained first pass), and that it would introduce some strange spacing and not give me a nice page layout after the fact. Because I knew the automatic zoning would not work well with the final export, I manually zoned the label with two zones instead: one for the title and one for the body.
Doing this, I could ensure better formatting on the body, and it allowed me to enable pattern training for the first zone to handle that crazy font. And it worked. After about 5 minutes of work, I was able to produce a Word document where I had to make only 5 edits to have a 100% accurate digital representation of the back label.
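The two-zone setup above can be pictured as nothing more than two rectangles with their own settings. Here is a rough sketch, assuming the image is a list of pixel rows; the `Zone` class, its `pattern_training` flag, and the coordinates are all hypothetical and do not correspond to any particular OCR product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A rectangular region of the page with its own OCR settings."""
    name: str
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive
    pattern_training: bool = False  # hypothetical per-zone switch


def crop(pixels, zone):
    """Return the part of the image covered by the zone."""
    return [row[zone.left:zone.right] for row in pixels[zone.top:zone.bottom]]


def split_into_zones(pixels, zones):
    """Map each zone name to its cropped image, ready for a per-zone OCR run."""
    return {zone.name: crop(pixels, zone) for zone in zones}


# A tiny stand-in "page": 4 rows by 4 columns of pixel values.
page = [[r * 10 + c for c in range(4)] for r in range(4)]
zones = [
    Zone("title", left=0, top=0, right=4, bottom=1, pattern_training=True),
    Zone("body", left=0, top=1, right=4, bottom=4),
]
parts = split_into_zones(page, zones)
print(parts["title"])
print(len(parts["body"]), "body rows")
```

Keeping the settings on the zone rather than on the whole page is what makes it possible to train on the stylized title without slowing down or confusing the read of the plain body text.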
I OCR a lot of crazy things: email, Flash and Silverlight screen captures, programming code snippets, even YouTube videos. Above and beyond the traditional scanning of paper documents and converting them to text, there are many other ways simple OCR technology can make you more efficient.
Now that the back label is converted, it’s time to design a front.