The Complicated, but Essential, World of Capture Software

By Bryant Duhon posted 03-14-2011 23:43


AIIM and Harvey Spencer of Harvey Spencer Associates have been working on the creation of a Capture Software Evaluation Report, which (we are both EXTREMELY happy to say) is done. Harvey is tremendously knowledgeable and conversant with the myriad uses of capture technology, understands the market, and knows many of the vendors in the capture space. As I type this, I think Harvey was the first expert in the ECM industry that I ever met, when I was, egads, an editorial intern at AIIM. Harvey was plugged into capture then; he's even more in tune with the industry now. We had a quick chat about the soon-to-be-released report.

Q: Why is this report different?

HS: This is the first time that anyone has tried to profile all the vendors and their products in one document.

Q: Capture is so often misunderstood and thought of as simple. Why?

HS: People often look at capture as purely scanning – and scanning of fairly low volumes of documents.

Scanning is clearly part of capture, but it is often, and increasingly, the smallest part. Capture also includes image processing, classification/understanding of the images, and data extraction with validation. These parts involve multiple technologies that have to be managed through networks and multiple servers. When users are processing thousands or tens of thousands of pages a day, these systems become quite complex.

Q: What are the divisions in capture software?

HS: 1. Ad-hoc image: The simplest systems rely on manual feeding of pages into scanners, MFP devices, or smartphones, with manual indexing of fields to locate the document. These systems are used by the knowledge worker who wants to work on or collaborate on a document, so they have to be simple, intuitive, and quick.
2. Batch image: Mid-range systems scan and manage higher volumes of pages as batches of documents. These systems were developed to optimize the different processes (scanning at the speed of today’s scanners, OCR, and manual indexing) by separating out the functions, as a replacement for inefficient departmental scanning and indexing solutions.
3. Batch transaction: Although there are a few freestanding systems based on scanners, many batch transaction systems are complex networked systems that scan and automatically extract data from images, e.g., invoice automation. These systems grew out of single-function intelligent OCR and OMR scanners, which were used extensively to automate the processing of single form types such as tax returns, medical claims, and surveys. The original purpose was to extract validated data from a paper form, but this has been extended to extracting and validating data from many different types of business document images, including faxes and emailed images.
4. Image toolkits/OCR: We are also issuing a separate report covering image toolkits used by many vendors to create capture solutions. These toolkits include image processing, classification, forms removal and reconstruction, plus OCR, OMR, and barcode recognition.

Q: Capture technology lies at the root of enterprise content management and has been around for 25 plus years – it’s a mature technology with an established business case. Why is the market so under-penetrated?

HS: Capture has been regarded solely as a front end to ECM. Image capture is indeed a front end to ECM, but transaction capture is not always. We now have a blending of the technologies, which is expanding utilization. By understanding images at the point of capture, processes can be kicked off during the capture cycle, before traditional ECM gets involved.

Q: What is the most important thing anyone considering the implementation of capture technology needs to know?

HS: Don’t just think of it as scanning – think of it as automated understanding and transformation of unstructured information.

The Capture Software Evaluation Report is available. Buy it here. 
