How to QA Documents After Scanning

By Mark Mandel posted 05-17-2010 16:42


Building on my last few posts about how to pick a scanner and how to index documents, this post covers QA methods.

Quality Assurance processes have the objective of ensuring that:

  • All pages were scanned successfully
  • Image quality is acceptable
  • Images are in the correct order and rotation

The degree to which most organizations can perform QA depends on resources and trust in the process.  When scanning at low volume, you can check interactively that each page was captured and that image quality is acceptable, viewing the pages in the viewer as they are scanned.

However, in most high volume environments this approach is not practical.  Stopping the scanner to rescan a document is not very efficient, and watching images fly by at 100 images per minute doesn’t really work.

So what options do you have for performing QA?

100% Image QA From Paper

In the most demanding environments, performing QA on each and every page may be required.  Using this technique, an operator views each page and compares it to the scanned image.  This is usually done in a batch processing environment where the scanned batches are routed to QA operators after scanning.

Operators view the paper side by side with the image, marking any image that fails a QA criterion.  Errors can include misfeeds (where more than one page goes through the auto feeder at once, so that one or more pages do not get captured), pages out of order, low-quality images (too light or too dark), pages folded over, and so on.

Some projects use this approach during a pilot phase or just for a few weeks at the beginning of a project, then scale back to review a smaller percentage as trust in the process builds up.

The ideal user interface for QA allows the user to mark bad images for rescan, rotate them if necessary, move images around if they are not in the correct order, and sometimes run image enhancement software to darken or lighten an image.  Image enhancement is usually found in dedicated forms processing applications and should be used carefully.

Rescanning should allow a user to replace bad images or insert new ones, ideally in the same position within a batch of documents.
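To make the replace-versus-insert distinction concrete, here is a minimal sketch that treats a batch as an ordered list of pages.  The Page class, the page IDs, and both helper functions are hypothetical names made up for illustration; they are not taken from any capture product.

```python
from dataclasses import dataclass


@dataclass
class Page:
    page_id: str                # hypothetical identifier for a scanned image
    needs_rescan: bool = False  # set by the QA operator when an image is rejected


def replace_page(batch, index, rescanned):
    """Swap a rejected image for its rescan, keeping the original position."""
    batch[index] = rescanned


def insert_page(batch, index, missed):
    """Insert a page that was never captured (for example after a misfeed)."""
    batch.insert(index, missed)


# p2 was rejected for quality; p3 was skipped by a misfeed.
batch = [Page("p1"), Page("p2", needs_rescan=True), Page("p4")]
replace_page(batch, 1, Page("p2-rescan"))
insert_page(batch, 2, Page("p3"))

order = [page.page_id for page in batch]
print(order)
```

Either way, the rest of the batch keeps its order, so the document does not have to be rescanned from the start.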

These features are not found in every ECM solution.  They are definitely included in dedicated forms processing applications and in many high end capture products.

Statistical Sampling

Often users reduce the QA percentage, doing sampling rather than 100% QA.  Some products allow you to specify what percentage you want to review, so you can easily adjust this from time to time.  If the product does not include this, you may have to devise a manual procedure or develop your own workflow application.  Outsourcing shops often charge more as the level of QA required goes up.
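If you do end up building this yourself, the selection step can be quite small.  The following is only a sketch, assuming each scanned page has some identifier and that a simple random sample is acceptable for your process; the function name and the ID format are made up for illustration.

```python
import math
import random


def select_for_review(page_ids, percent, seed=None):
    """Pick a random sample of pages for QA, returned in scan order.

    percent is the share of pages to review (0 to 100). The sample size
    is rounded up so that a small batch still gets at least one page
    reviewed whenever percent is above zero.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    sample_size = math.ceil(len(page_ids) * percent / 100)
    chosen = set(rng.sample(page_ids, sample_size))
    return [page_id for page_id in page_ids if page_id in chosen]


# Hypothetical batch of 200 pages reviewed at 5%.
pages = [f"page-{n:04d}" for n in range(1, 201)]
to_review = select_for_review(pages, percent=5, seed=1)
print(len(to_review), "pages routed to QA")
```

Changing the percent argument is all it takes to step down from 100% to a lower review rate as trust in the process builds.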

I have seen projects start with 100% during the pilot, then gradually decrease the percentage every few months, to 50%, to 20%, and eventually 5% in high volume situations.


A technique that is sometimes used is to print a date and sequence number at the top or side of each page during scanning (this of course requires a scanner that can do this).  After scanning, operators can quickly riffle through the pages to make sure every page has printing on it and was therefore scanned successfully.  This technique is often used for processing health claim forms, where high accuracy is critical to the business process.


As you can see, there are a number of issues to consider when figuring out how to QA your scanning process.  Consistently poor image quality may be addressed by cleaning the scanners regularly or by getting them serviced.  Consistent misfeeds should lead you to look at the scanners and the prep process.  You should have a reporting system that allows you to see these trends so that they can be addressed systematically.
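As a rough illustration of the kind of report that surfaces these trends, the sketch below tallies QA failures by scanner and by error type.  The log format, scanner names, and error labels are all invented for the example; a real system would pull them from whatever your QA tool records.

```python
from collections import Counter

# Hypothetical QA log: one (scanner, error type) entry per rejected image.
qa_failures = [
    ("scanner-A", "misfeed"),
    ("scanner-A", "too dark"),
    ("scanner-B", "misfeed"),
    ("scanner-A", "misfeed"),
    ("scanner-B", "out of order"),
]


def summarize(failures):
    """Count failures per (scanner, error type) pair and per error type."""
    by_scanner_and_type = Counter(failures)
    by_type = Counter(error for _, error in failures)
    return by_scanner_and_type, by_type


by_scanner_and_type, by_type = summarize(qa_failures)

# Most frequent problems first, so recurring issues stand out.
for (scanner, error), count in by_scanner_and_type.most_common():
    print(f"{scanner:10} {error:14} {count}")
```

A scanner that keeps showing up at the top of this list for misfeeds is a candidate for servicing, while an error type that is spread evenly across scanners points more toward the prep process.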