Answering Questions on Metadata vs. Big Data and Sharing Helpful Resources

By Jeffrey Lewis posted 04-03-2014 22:58


Once again, I want to thank you so much for choosing to attend my session. You were a great audience, and I am sorry that I ran over and did not have a chance to answer everyone's questions. Below are answers to the questions submitted through the app, as well as the sources I referenced:


Question 1) Do you recommend ALL content owners add their own metadata? Or a small group of super users for consistency's sake?


Answer: It can vary from setting to setting. A mailroom, for example, is often a place where users are already entering their own metadata. It is unrealistic to expect ALL content owners to add their own metadata, because C-level executives and other senior content owners might consider entering metadata beneath them. This is a great place to leverage technology for metadata entry. Your content owners are the subject matter experts for the relevant metadata, so if they are creating documents to go into a record keeping system or an ECM, I’d recommend creating a form or cover sheet that can be OCR’ed to capture the relevant metadata and having the metadata entered that way. The reality is that metadata entry is not the sexiest task, and a small group of hand-selected super users assigned to it will most likely feel disengaged.
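To make the cover sheet idea a little more concrete, here is a minimal Python sketch of what could happen after the OCR step. It assumes the OCR engine has already turned the cover sheet into plain text with one "Label: value" pair per line. The field names and the sample sheet are hypothetical, and a real form would define its own.

```python
import re

# Hypothetical required fields for an illustrative cover sheet.
REQUIRED_FIELDS = ("Document Type", "Department", "Owner", "Retention Period")


def parse_cover_sheet(ocr_text):
    """Turn 'Label: value' lines from OCR output into a metadata dict."""
    metadata = {}
    for line in ocr_text.splitlines():
        # Label is everything before the first colon; value is the rest.
        match = re.match(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$", line)
        if match:
            label, value = match.groups()
            metadata[label] = value
    return metadata


def missing_fields(metadata):
    """List required fields the content owner left off the sheet."""
    return [field for field in REQUIRED_FIELDS if field not in metadata]


# Assumed OCR output for one scanned cover sheet.
sample = """
Document Type: Invoice
Department :  Accounts Payable
Owner: J. Smith
"""

meta = parse_cover_sheet(sample)
print(meta)
print("Missing:", missing_fields(meta))
```

The content owner still supplies the values, since they are the subject matter expert, but nobody has to key them into the system by hand, and incomplete sheets can be sent back before the document is filed.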


Question 2) What is veracity in relation to Big Data?


Answer: Veracity in big data refers to the noise, abnormalities and biases in data. The key question to ask is whether the data being stored and mined is meaningful to the problem being analyzed. How confident are you in the quality of your data? One in three business leaders don’t trust the quality of the information they use to make decisions. In one survey, 27% of respondents were uncertain of the accuracy of their data. In the U.S. economy, 3.1 trillion dollars a year is wasted because of bad data quality. These statistics are taken from The Four V’s of Big Data by IBM Data Hub.
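One rough way to start answering "how confident are you in the quality of your data?" is to count how many records are incomplete or implausible. The Python sketch below is only an illustration under stated assumptions: the sensor records, the required fields, and the valid temperature range are all made up, and real veracity checks would go much further (bias, duplicates, stale data).

```python
def quality_report(records, required, valid_ranges):
    """Count records with missing required fields or out-of-range values."""
    incomplete = 0
    out_of_range = 0
    for rec in records:
        # A record is incomplete if any required field is absent or blank.
        if any(rec.get(field) in (None, "") for field in required):
            incomplete += 1
        # A record is out of range if any present value falls outside its limits.
        for field, (low, high) in valid_ranges.items():
            value = rec.get(field)
            if value not in (None, "") and not (low <= value <= high):
                out_of_range += 1
                break
    return {
        "total": len(records),
        "incomplete": incomplete,
        "out_of_range": out_of_range,
    }


# Hypothetical sensor readings; names and limits are assumptions.
readings = [
    {"sensor": "A", "temp_c": 21.5},
    {"sensor": "B", "temp_c": None},   # missing reading
    {"sensor": "", "temp_c": 19.0},    # missing sensor id
    {"sensor": "C", "temp_c": 480.0},  # implausible value
]

report = quality_report(
    readings,
    required=("sensor", "temp_c"),
    valid_ranges={"temp_c": (-40.0, 60.0)},
)
print(report)
```

Even a simple count like this gives you a number to track over time instead of a gut feeling about whether the data can be trusted.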


Question 3) Is big data just about metadata or is it about full content analysis?


Answer: You are correct, and if I had had more time I would have gone into this in more depth in my presentation. Metadata powers big data, but the reason for big data is deeper analysis of unstructured as well as structured content that we couldn’t make sense of before or didn’t have access to before, such as sensor data or information from social media. This ties back to a quote from “Age of Context: Mobile, Sensors, Data and the Future of Privacy” by Robert Scoble and Shel Israel, which I read. The quote states:

The fact is, we have more data and more insight about the customer than ever before, and customers expect companies to use it. Now, companies cannot proceed with business as usual. They need to change and advance to meet the rising expectations of modern customers.


With all the information that we have, we must do analytics on it to create an intimate feeling with our customers.  With this


Here are some links to citations that I mentioned in my talk that you may want to dive into on your own time:


Are You a Data Hoarder by Jeff Bertolucci in Information Week

The Big Data Balancing Act: Too Much Yin and Not Enough Yang by Dave Jones 


Defining Big Data by Elizabeth Gardener in Information Management

Recommends: Item to Item Collaborative Filtering


How Companies Like Amazon Use Big Data to Make you Love Them


Amazon: Using Big Data Analytics To Read Your Mind

Retailers Using Big Data: The Secret Behind Amazon and Nordstrom’s Success


How Amazon Uses Big Data to Prevent Warehouse Theft

Lastly, I want to leave you with a couple of quotes from Alan Pelz-Sharpe's keynote, which tie into why metadata must be a precursor to big data and why it should take priority over big data:
"Don't chase fashions - focus on your customers and your products/services - more people and less technology may be the way forward."
"Algorithms only tell us so much - winners invest in smart people to use, interact with and ask the right questions."

#AIIM14 #metadata #analytics #AIIM2014 #amazon #BigData