Dark Data – Rediscovering This ‘Hidden’ Business Intelligence Treasure in Your Document Capture Process

By Don Dew posted 05-30-2014 17:54


In today’s vast digital world, organizations are scrambling to better control information to meet legal, regulatory, and business requirements. Adding to the challenge is an increasing number of information sources such as social media, mobile devices, and cloud storage. On top of this comes a general increase in records, receipts, forms, email, and desktop documents, which further amplifies the amount of information that companies must discover and manage.

Within this deluge is the continuous need for businesses to better leverage data from traditional paper forms and other documents. Organizations today need to use information to continuously improve operations, while balancing the need to protect this important resource through better governance policies. Capturing data on paper documents that is otherwise considered ‘dark’ can help organizations route information more quickly, store and retrieve details more effectively, and identify risk points such as unauthorized transactions.

But what is ‘dark data’ to begin with? How can organizations identify it? And why should they care? A recent study from Parascript and AIIM, referenced below, begins to touch on these important questions.

First, what is dark data? ‘Dark data’ is information that is not fully identified, captured and leveraged. While it is everywhere, it is especially prominent on paper documents such as forms, receipts, checks and applications, and may include notes and annotations, signatures and other handwritten and printed information. It is called “dark” because this information is often not captured, not leveraged, or not viewed as usable.

While some organizations consider ‘dark data’ ominous, this information can be extremely valuable for information governance, legal discovery, enhanced customer service and other purposes. Whether it’s signatures, notes, or handwritten information on forms, one consistent aspect of dark data is that it is often not utilized to its full potential. This was one recent finding in the study, “Shedding light on the dark data in your document capture processes.” Specific insights of the study from AIIM and Parascript include:

I. Organizations are missing out on capturing dark data, which could be valuable to their businesses. This includes signatures and other handwritten information. According to the survey:

·         40% of survey respondents said half or more of their inbound forms have handwritten data fields.

·         55% said signatures are on half or more of these documents.

·         The majority of respondents consider signatures a valuable and critical part of the information governance process: 86% indicated that signature presence, validation and lookup would be useful for process enablement or discovery.

·         44% said they would find it extremely or very useful to recognize handwritten keywords in open-ended form fields on business documents, for use in tagging or metadata correction.

II. Technology could offer a big advantage and, potentially, a huge payback to companies in accessing this untapped reserve of ‘dark data’. While most companies recognize the presence and importance of this elusive yet valuable dark data, particularly when it comes to handwritten fields, many are still not leveraging it. Instead, the information is either ignored or manually entered, leading to errors or loss. This is a huge opportunity for companies to look to intelligent recognition technologies to capture and better leverage this data, particularly data found in both structured (field-based) and unstructured (freeform) handwriting.

The study finds that, in general, the presence of handwriting on these forms is substantial, but adoption of handwriting recognition technology is low:

·         40% of survey respondents said half or more of their inbound forms have handwritten data fields.

·         Only 6% of companies surveyed currently use ICR (Intelligent Character Recognition), meaning that most don’t even have the base technology that would provide access to unstructured handwritten data.

·         35% of ICR users from the survey report a payback period of 12 months or less on ICR applications, and 55% see ROI within 18 months.

·         Overall, respondents indicated that an average productivity improvement of 31% was likely if recognition of handwritten text could be automated; 28% said they would expect an improvement of 50% or more.

What information is your organization missing out on leveraging? Are there instances where you are capturing important information on forms and applications, such as preferences and feedback on your products or services, that your organization is never able to fully leverage? How could you benefit from having faster, easier, more searchable access to currently untapped information?

As a society, we are only starting to tap into and fully leverage the power of dark data.

By harnessing advanced technologies to capture and capitalize on unstructured, untagged and untapped data, organizations could potentially unlock a gold mine of information.  Leveraging this dark data could boost business intelligence and open up opportunities to gain a competitive edge.

For more details on the findings, visit here.

Don Dew is with Parascript, online at www.parascript.com.



#darkdata #ScanningandCapture #BusinessIntelligence #documentcapture