Election Take-Away

By Daniel Antion posted 11-09-2012 08:37


Even in my personal blog, I try to keep quiet about politics, religion and the other topics that usually seem to divide people. I don’t care what political party or candidate you support, and I would never want to challenge anyone’s belief system. There is one thing that I do worry about, though, and that’s the degree to which people on all sides of these issues are working from incorrect information. Up until the election, this topic was all the rage, with people pointing to fact-checking websites, fact-checking sites for those websites, and various versions of the “facts” on myriad issues. I find that convincing people that their “facts” are wrong is usually a fool’s errand, so I don’t make it my concern. What is my concern? What happens if the information I am providing turns out to be incorrect?

When we build information management solutions, we worry about a lot of things: capacity, bandwidth, search and findability and, increasingly, usability. Today, I am starting to worry about accuracy. Why the sudden interest in accuracy? Shouldn’t accuracy always have been at the top of the list in information management? I think the answer to that last question is “yes and no”, and I think understanding why serves to answer the first question.

Separate and Accurate – When there was a sharp division between structured and unstructured data, accuracy was easy. Structured data is manipulated accurately by algorithms that are well understood and tested. If the algorithms need to change, the tests change and we march forward in lock-step. Unstructured data is what it is. A document’s contents might be accurate or inaccurate, but the document itself is just a container. Two things are happening that are changing this simple dynamic. One, we are working to combine structured and unstructured data. For example, we have a project on the drawing board that will let you see a policy document, the related reports that we have about the insured (customer), and the premium and other numbers associated with that policy. Two, we are starting to derive structured data from the unstructured content. We have a number of projects where we are counting documents, measuring activity and charting progress, all based on the content and on metadata in a library. In each of these cases, we are relying on things beyond our control remaining accurate. Systems can change, and we could find ourselves hooked up to the wrong data element. Procedures can change, and we could be counting activity that no longer holds the same meaning. In other words, there is now an additional degree of separation between the underlying data and the information we are presenting, and that is always cause for concern.

But It’s On the Portal – There’s a TV commercial for an insurance company in which one character states that “you can’t put anything on the Internet that isn’t true”, a fact that she heard on the Internet. Of course, we all know that not everything on the Internet is true, but we expect the stuff on our company’s intranet to be true, and we expect it to be accurate. This expectation, and the firm belief that it will be met, is what makes it so important to bridge those degrees of separation and maintain accuracy. We have to remain aware of the links that we establish, and we have to incorporate downstream usage in our change control process. It’s no longer good enough to fix our system or to update our procedures; we have to know who uses our information, and we have to keep them in the loop.
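One way to picture “knowing who uses our information” is a simple registry that maps each source data element to the reports and views built on top of it, so that a change request can list who needs to be told. The sketch below is only an illustration under my own assumptions: the class name `DownstreamRegistry`, the element names such as `library.document_status`, and the consumer names are all hypothetical, not part of any real system described here.

```python
from collections import defaultdict


class DownstreamRegistry:
    """Tracks which downstream consumers depend on which source data elements.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        # Maps a source element name to the set of consumers that rely on it.
        self._consumers = defaultdict(set)

    def register(self, element, consumer):
        """Record that `consumer` relies on `element`."""
        self._consumers[element].add(consumer)

    def impacted_by(self, changed_elements):
        """Return a sorted list of consumers affected by a proposed change."""
        impacted = set()
        for element in changed_elements:
            # .get() avoids creating empty entries for unknown elements.
            impacted |= self._consumers.get(element, set())
        return sorted(impacted)


# Example registrations (all names are made up for this sketch).
registry = DownstreamRegistry()
registry.register("policy.premium", "policy dashboard")
registry.register("library.document_status", "activity chart")
registry.register("library.document_status", "policy dashboard")

# Before changing how document status is recorded, ask who relies on it.
print(registry.impacted_by(["library.document_status"]))
```

In a change control review, the list returned by `impacted_by` would be the set of owners to keep in the loop before the system or procedure change goes ahead.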
