In the Enterprise Content Management (ECM) world, content can exist in several places and in multiple formats simultaneously. In this post I will focus on Content Relevance. In subsequent posts I will tackle the challenge of keeping content synchronized and protected. With the rise of Cloud Computing, both pose significant opportunities and challenges.
In the document capture (aka scanning) sub-segment of the ECM world, content can take many forms. Some content needs to be kept as an exact digital record of what is on a piece of paper (sometimes both sides of it). Other times the paper can be discarded and only specific sections of the document are extracted for further use. When segments of a document, or information about the document, are captured, the result is what is commonly called Metadata.
Metadata can be as simple as the time and place the paper was scanned, or it may involve advanced image processing to read machine-printed text (OCR) or handwritten text (ICR), bar code / QR code recognition, database lookups and much more, depending on what downstream processes need from the content.
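To make the kinds of metadata described above a little more concrete, here is a minimal sketch of what a capture record might look like. The class name `CaptureMetadata`, its field names and the `index_fields` helper are all hypothetical, chosen only for illustration; they do not come from any particular capture product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CaptureMetadata:
    """Hypothetical metadata record for one scanned document."""
    scanned_at: datetime                 # when the paper was scanned
    scan_location: str                   # where it was scanned
    ocr_text: Optional[str] = None       # machine-printed text, if OCR was run
    icr_text: Optional[str] = None       # handwritten text, if ICR was run
    barcodes: list[str] = field(default_factory=list)      # decoded bar code / QR code values
    lookups: dict[str, str] = field(default_factory=dict)  # values found by database lookups

    def index_fields(self) -> dict[str, str]:
        """Return only the fields that were actually populated, as strings."""
        fields = {
            "scanned_at": self.scanned_at.isoformat(),
            "scan_location": self.scan_location,
        }
        if self.ocr_text:
            fields["ocr_text"] = self.ocr_text
        if self.icr_text:
            fields["icr_text"] = self.icr_text
        if self.barcodes:
            fields["barcodes"] = ",".join(self.barcodes)
        fields.update(self.lookups)
        return fields


# Simplest case: only the time and place of the scan are recorded.
simple = CaptureMetadata(datetime(2011, 1, 1, tzinfo=timezone.utc), "Mailroom A")

# Richer case: bar codes were recognized and a database lookup was performed.
rich = CaptureMetadata(
    datetime(2011, 1, 1, tzinfo=timezone.utc),
    "Mailroom A",
    barcodes=["INV-1", "INV-2"],
    lookups={"customer": "ACME"},
)
```

How many of these fields get filled in depends entirely on what the downstream process needs; the simple record is just as valid as the rich one.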
Definition of Metadata: Metadata is loosely defined as data about data.
Source: NISO Guidelines – Understanding Metadata
Note: I’m focused on the metadata that is associated with paper-based documents here. Metadata is also generated for photographs and videos. Metadata comes from many sources – including geo-location services, your automobile and your mobile phone. In the ECM world there may be a need to capture, manage, act upon and store this metadata. If there is enough interest I will tackle these points in follow up blog posts.
When is content really content?
It’s a little like the Supreme Court case of Jacobellis v. Ohio, 378 U.S. 184 (1964), where Justice Potter Stewart declined to define hard-core pornography but said, “I know it when I see it.”
“I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description ["hard-core pornography"]; and perhaps I could never succeed in intelligibly doing so. But I know it when I see it, and the motion picture involved in this case is not that.”
Justice Potter Stewart, concurring opinion in Jacobellis v. Ohio, 378 U.S. 184 (1964)
When is Content Relevant?
- Is Content Relevant when the customer says it’s relevant?
- Is Content Relevant when the government says it’s relevant?
Answer: YES!
Content becomes relevant when the user or the process requiring it deems it to be relevant. Of course, organizational and governmental requirements also define what content should be kept and, in most cases, for how long it should be kept.
Is Metadata Content?
Yes. Metadata is indeed content. Metadata can take on many forms and in some cases may be only transitory and time-based. For example, a workflow process may create metadata that has a functional life of seconds or minutes. This does not make it any less valuable or critical: that one piece of metadata may need to be brought back to life (rehydrated) in the event of a process rollback.
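As a rough illustration of short-lived workflow metadata that can still be rehydrated, here is a minimal sketch. The `TransientMetadataStore` class, its methods and the `approval_step` key are hypothetical and not taken from any workflow product. It assumes the expired value is simply retained in memory, whereas a real system would more likely persist it somewhere durable.

```python
import time
from typing import Any, Optional


class TransientMetadataStore:
    """Hypothetical store: values expire quickly but can be rehydrated."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        # key -> (expiry time, value); expired entries are kept for rollback
        self._entries: dict[str, tuple[float, Any]] = {}

    def put(self, key: str, value: Any, now: Optional[float] = None) -> None:
        """Store a value that stays live for ttl_seconds."""
        now = time.monotonic() if now is None else now
        self._entries[key] = (now + self.ttl_seconds, value)

    def get(self, key: str, now: Optional[float] = None) -> Optional[Any]:
        """Return the value while it is live, or None once it has expired."""
        now = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        return value if now < expires_at else None

    def rehydrate(self, key: str, now: Optional[float] = None) -> bool:
        """Give a stored value a fresh lifetime, e.g. during a process rollback."""
        if key not in self._entries:
            return False
        _, value = self._entries[key]
        self.put(key, value, now)
        return True


# Explicit 'now' values are passed so the example does not depend on wall-clock time.
store = TransientMetadataStore(ttl_seconds=60)
store.put("approval_step", "pending", now=0)
live_value = store.get("approval_step", now=30)       # still within its lifetime
expired_value = store.get("approval_step", now=61)    # functional life is over
rolled_back = store.rehydrate("approval_step", now=61)
restored_value = store.get("approval_step", now=62)   # live again after rollback
```

The point of the sketch is that "expired" does not have to mean "deleted": the value stops being served once its lifetime is over, yet it remains available for a rollback.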
The Relevance of Content
It has been said that Content is King. I agree, but with one caveat: in context. This is especially true of metadata. At best, Content without Context is useless, which also makes it of little value to prying eyes (search engines or hackers). At worst, it can be misconstrued to create an avalanche of circumstantial and misleading information.
Pillar One is the First Step
From a Document Capture (aka scanning) perspective, Pillar One is the first point where paper meets scanner. It is also the first point where indexing needs to be properly defined to ensure the right content is captured, and where context can be set.
I believe that Content needs to be captured at the source, and that its Context needs to be captured along with it. While this is not always possible, it should always be the goal.
What do you think?
- When do you think Content is Relevant?
- Is Metadata content?
- Bonus Question: Does your company capture and manage “Social Media” content?
I’d like to hear your thoughts. Comment here or drop me a line at any of the contact points below.
image credit: coredotnet
About The Author:
I have spent the last 20 years working in various aspects of the ECM industry. I am currently with Kodak as a Director of Business Development. In my past I have spent time at Kofax, Microsoft, FileNet, K2, and at Captaris (which was acquired by Open Text). Prior to that I was a Unix VAR running my own company. Follow me on Twitter, check my blog, send email or find me on Facebook or LinkedIn.