Taming the Unstructured Data Beast with Object-Based Storage

By Megan Mohrmann posted 03-20-2017 18:21



In today’s world of ever-increasing big data, at least 80% of that data is unstructured. Unstructured data, which at a basic level includes any data (textual or non-textual) lacking a pre-defined or recognizable organization, structure or database containment, comes in many forms. From media files (such as audio, video, and photo) to website content, social media content, email and text messages, instant messages, PowerPoint presentations, Word documents, and beyond, we are surrounded by unstructured data. It permeates our daily lives, and most of our activities use or involve it to some extent. So, what should we do with all of it? How do we store it? And, perhaps more importantly from a business perspective, how do we extract value from it?

As things currently stand with big data management, companies are turning to various solutions, including data mining, analytics, cloud computing and natural language processing (NLP), to get a handle on their big data. Each of these solutions has its inherent benefits and drawbacks, and all help in varying degrees to tackle the growing terabytes and petabytes of big data, both structured and unstructured, that a company generates. One solution companies are increasingly turning to for managing unstructured data in particular is object-based storage.

An emerging technological trend, object-based storage organizes data as individual objects on a flat plane, each with a unique identifier and attached metadata. Traditional file systems, by contrast, manage data in a vertical hierarchy of directories with metadata attached at the file level, while block storage manages it in blocks or volumes. Object-based storage does away with the concept of directories and sub-directories, placing the emphasis instead on the individual object and its unique identifier. This approach is a good fit for unstructured data, which is often difficult or impossible to classify within a traditional storage structure. Instead, the unstructured data is packaged into objects, each with attached metadata and an identifier. This allows the data objects to be stored in grid systems and modular units, capable of aggregating on multiple levels and across various locations, whereas file and block systems are limited in cross-integration by their organizational and structural constraints.
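
To make the flat layout concrete, here is a minimal Python sketch of an in-memory object store. The class and method names (StoredObject, FlatObjectStore, put, find) are hypothetical and chosen only for illustration; they are not taken from any particular product. Each object is just data plus metadata, filed under a generated identifier with no directory path involved.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: raw bytes plus descriptive metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Hypothetical in-memory store with a single flat namespace."""

    def __init__(self):
        # object ID -> StoredObject; there are no directories or paths
        self._objects = {}

    def put(self, data, **metadata):
        """Store the data with its metadata and return the new object ID."""
        object_id = str(uuid.uuid4())  # unique identifier for this object
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def find(self, **criteria):
        """Return the IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = FlatObjectStore()
photo_id = store.put(b"<jpeg bytes>", kind="photo", owner="marketing")
memo_id = store.put(b"Quarterly memo text", kind="document", owner="legal")
photo_matches = store.find(kind="photo")  # search by metadata, not by folder
```

Because the metadata travels with each object, a photo and a memo can sit side by side in the same store and still be found by what they are rather than by where they were filed.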

Aside from the cost savings object-based storage can offer (for example, through online and cloud storage solutions), perhaps its greatest advantage for unstructured data is the flexibility and versatility it provides in retrieving specific data. Because each object has a globally unique identifier (which can be thought of as an address given to the object within the system), it can be retrieved as needed, whether or not one knows the physical location of the object, and no matter how much data or how many objects the system holds. This is often compared to fetching a car with a valet ticket.
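
The valet-ticket idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any specific product works: the ValetStore class, the node names and the round-robin placement rule are all hypothetical. The point is that the caller keeps only the identifier, while the store alone tracks where the object physically lives.

```python
import uuid


class ValetStore:
    """Hypothetical store: callers hold only a ticket (the object ID)."""

    def __init__(self, node_names):
        self._node_names = list(node_names)
        self._nodes = {name: {} for name in self._node_names}  # node -> {ID: bytes}
        self._location = {}  # object ID -> node name, hidden from callers

    def put(self, data):
        """Park the data on some node and hand back the ticket."""
        object_id = str(uuid.uuid4())
        # Placement is an internal detail; here, simple round-robin by count.
        node = self._node_names[len(self._location) % len(self._node_names)]
        self._nodes[node][object_id] = data
        self._location[object_id] = node
        return object_id

    def get(self, object_id):
        """Retrieve by ticket alone; the caller never names a node."""
        node = self._location[object_id]
        return self._nodes[node][object_id]


store = ValetStore(["node-a", "node-b", "node-c"])
tickets = [store.put(f"object {i}".encode()) for i in range(1000)]
retrieved = store.get(tickets[417])  # one lookup, however many objects exist
```

Retrieval here is a direct lookup by identifier, so it does not involve walking a directory tree or scanning the other 999 objects.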

Furthermore, object-based storage allows companies to keep up with constant growth. As unstructured data continues to be generated, an object-based storage system’s flat plane structure allows additional objects to be added easily, without restructuring the system or running into the storage limitations often experienced in a file or block storage system. This ease of expansion ultimately provides a cost-effective and labor-efficient way for the company to scale its storage system to match its ever-increasing data.
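
As a rough illustration of growth without restructuring, the following Python sketch adds a new storage node whenever the current one fills up. The ScalableStore class and its capacity_per_node parameter are hypothetical, and real systems use far more sophisticated placement; the sketch only shows that existing objects stay where they are and their identifiers remain valid as capacity is added.

```python
import uuid


class ScalableStore:
    """Hypothetical store that grows by adding nodes, not by restructuring."""

    def __init__(self, capacity_per_node):
        self.capacity_per_node = capacity_per_node
        self.nodes = [{}]     # each node is a dict: object ID -> bytes
        self._location = {}   # object ID -> index of the node holding it

    def put(self, data):
        if len(self.nodes[-1]) >= self.capacity_per_node:
            self.nodes.append({})  # scale out; existing objects do not move
        object_id = str(uuid.uuid4())
        self.nodes[-1][object_id] = data
        self._location[object_id] = len(self.nodes) - 1
        return object_id

    def get(self, object_id):
        return self.nodes[self._location[object_id]][object_id]


store = ScalableStore(capacity_per_node=100)
first_id = store.put(b"first object")
for i in range(250):
    store.put(f"object {i}".encode())

node_count = len(store.nodes)        # nodes were added as the store filled
first_object = store.get(first_id)   # the original ID still resolves
```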

In short, traditional storage systems have their merits and advantages, especially with respect to structured files and data. Object-based storage, however, provides structure, versatility and storage scalability to unstructured data. As companies ponder how to best manage their unstructured data, they might consider the advantages of using an object-based storage system.


About the Author: Since 2013, Jared Walker has acted as the curator extraordinaire for Zasio’s legal research database. Aside from expanding the database through new domestic and international research, he focuses on maintaining the currency, technical accuracy, and substantive relevancy of the existing citations. Armed with extensive experience in legal research and scholarship and a broad customer service background, Jared leverages his developed skills and unique perspective to provide a superior experience for our clients, by ensuring Zasio’s database of records retention law is unmatched in its breadth and accessibility.