Recognition technology has become the ‘soup du jour’ for companies, organizations and government agencies around the globe that want to automatically process information from forms, checks, mail and other important business documents with greater speed and accuracy.
The technology increases productivity by reducing the manual intervention that data entry requires, along with its associated costs.
Recognition software can comprise a variety of technologies: OCR (Optical Character Recognition) software, which recognizes machine-printed data; ICR (Intelligent Character Recognition) software, which recognizes handwriting, including handprint and cursive writing (though definitions and capabilities vary among vendors); and other image analysis and pattern recognition technologies. To get the most out of recognition software, the people who implement and work with it need to understand how the technology works in order to set appropriate expectations.
Following is an excerpt from the eBook: Understanding Recognition Technology-An Insider’s Guide, which explains the deeper aspects of how recognition software works, and includes examples, common misconceptions, best practices, and guidelines to get the most out of your implementation.
Here are six key factors to understanding the mechanics of recognition technology:
· Understanding the Recognition Process: Recognition technology approaches the process of reading information differently than humans do. After all input images are fed to the software, the technology recognizes certain words and then divides its answers into two streams: accepted answers and rejected answers. Accepted answers have a high probability of correct recognition, while rejects include words that cannot be accurately recognized. Having this basic understanding will help set expectations and business rules to optimize recognition performance for your organization.
· Understanding Errors and Rejects: There are three possible outcomes when a recognition engine attempts to read any data: a correct answer, an error, or a reject. It is important to understand errors and rejects and how to find the right balance between them. An error occurs when a recognition engine reports an incorrect result. Error rates can be significantly reduced by combining advanced automatic recognition technology with manual data validation. Rejects may be caused by the inability to process specific input or by the need to reduce the error rate to a level that the application can tolerate.
· The Importance of the Operating Point: There is a lot of confusion about how to judge the accuracy of recognition technology. This is where a metric called the operating point comes in. The operating point is a critical number since it is used to build a business case, establish an ROI and set the benchmark to measure against. It is composed of two numbers: the read rate and the error rate. To understand the importance of the operating point and how it is determined, it is first essential to understand the “confidence value.” A confidence value is a number on a selected scale, for example 0 to 100, that indicates the reliability of the recognition answer. Answers whose confidence value meets a chosen threshold are accepted and the rest are rejected, so the choice of threshold determines both rates. Let’s use an operating point of 85% read rate at 1% error rate as an example: This means that out of 100 documents, the recognition software will successfully read 85 and is likely to produce an error on 1. The other 15 documents would fall below the required confidence level and be passed to a human for review. In comparison, a human keyer or reviewer typically produces errors at a 3% rate.
· Common Misconceptions About Recognition: There are two common misconceptions about automated recognition technology: (1) that all items in a stream have an equal level of difficulty, and (2) that values should be recognized first and then applied against rules and context. Errors and inefficiencies caused by these misconceptions can be resolved with appropriate tuning. For example, the task of detecting a CAR/LAR mismatch on a check can be solved more efficiently if the engine is specifically targeted to find a discrepancy between the courtesy amount (CAR, written in digits) and the legal amount (LAR, written in words), rather than reading the contents of each field and then comparing the two results.
· Context and Business Rules Help to Get the Most Benefits From Data: Context and business rules are two tools that companies can use to process information faster and improve data quality. Context plays a significant role in the recognition process by helping to explain the characteristics or properties of data within a field, while business rules can be used to automatically populate fields with database lookup, and to ensure that the data meets defined criteria. These “magic bullets” perform some of the “thinking” or logic for organizations in recognizing data by determining how information is processed, increasing the accuracy of the data read by the software, speeding up the process and reducing the amount of required manual data entry. These rules allow more information to be recognized automatically during the “first pass” of processing, transfer the same context to those entering data manually (keying) during the second phase of data entry, and help to identify and verify low confidence fields during the final validation stage. Efficient and accurate processing enables companies to improve internal and external transactions, save time, improve customer service, increase collection rates, and reduce time spent researching problems due to incorrect data.
· Understanding Technology Limitations and Managing Expectations: The goal of every recognition engine is to produce the most accurate results possible. However, different recognition technologies often produce significantly different results. Advanced handwriting recognition technology, for instance, uses dynamic vocabularies to produce better results on poor quality images and hard-to-read characters, and therefore boosts accuracy in reading cursive and handprint information.
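To make the operating point concrete, here is a minimal sketch in Python of how a read rate and an error rate could be measured on a test set. It is an illustration only, not any vendor's API: the names `Result`, `operating_point` and `sample_results` are hypothetical, the data is synthetic, and it assumes that an answer is accepted when its confidence value meets a chosen threshold and that both rates are counted against all items, as in the 85-out-of-100 example above.

```python
from dataclasses import dataclass


@dataclass
class Result:
    answer: str      # what the engine read
    truth: str       # correct value from a hand-labeled test set
    confidence: int  # 0-100 scale, as in the example above


def operating_point(results, threshold):
    """Return (read_rate, error_rate), both as fractions of all items."""
    if not results:
        raise ValueError("need at least one result")
    # Assumption: an answer is accepted when confidence >= threshold.
    accepted = [r for r in results if r.confidence >= threshold]
    # An error is an accepted answer that does not match the truth.
    errors = [r for r in accepted if r.answer != r.truth]
    return len(accepted) / len(results), len(errors) / len(results)


def sample_results():
    """Synthetic data: 100 made-up items, not real engine output."""
    return (
        [Result("100.00", "100.00", 90)] * 84    # accepted, correct
        + [Result("700.00", "100.00", 90)]       # accepted, wrong
        + [Result("100.00", "100.00", 40)] * 10  # low confidence, correct
        + [Result("109.00", "100.00", 40)] * 5   # low confidence, wrong
    )


if __name__ == "__main__":
    read, err = operating_point(sample_results(), threshold=50)
    print(f"read rate {read:.0%}, error rate {err:.0%}")
```

With this made-up data and a threshold of 50, the script prints "read rate 85%, error rate 1%". Lowering the threshold to 0 accepts everything, which raises the read rate to 100% but the error rate to 6%. That is the balance between errors and rejects described above.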
There’s no doubt that organizations with complex, high-volume processing requirements reap the many benefits of automated recognition solutions. To take advantage of the full potential of this technology, businesses need to understand how it works. With better insight into what these recognition solutions do and how to use them, the software can be implemented and fine-tuned to deliver more accurate results faster, which leads to a more efficient operation and a healthier bottom line.
For more insights, please read the eBook: Understanding Recognition Technology-An Insider’s Guide.
Don Dew is Director of Marketing for Parascript, a leading recognition solutions provider, online at www.parascript.com.