The Email Pyramid

By Mark Mandel posted 10-17-2012 13:35


One of the key issues in Records Management is how to manage email. Email is the dominant content type requested under FOIA and in eDiscovery because it often contains the “smoking gun,” and because people tend to be unguarded when they communicate through this medium. Email must therefore be governed to reduce liability. Email also serves as a key business record that documents correspondence between parties, often as the only record of important decisions or acknowledgements. Email volume is growing exponentially and is a significant issue in most organizations.

There are many types of email solutions, but the prevailing approach today is to manage email as a component of an Enterprise Information Management (EIM) Suite. This EIM approach provides Information Governance for all content, regardless of media, managed by an integrated Records Management solution. The solution applies governance and retention rules to all content, from the time it is created or ingested until its final disposition. This approach is called Content Lifecycle Management.

As email volume continues to grow exponentially, organizations face several issues:

· Optimization of the production email system
· Capture of some email as important business records
· Retention of email according to the organization’s retention policy
· Deletion of transitory or non-record email

An EIM system, using the latest integrated archiving and auto-classification tools, addresses these issues in a practical way. This approach involves a concept called the Email Pyramid. I initially learned about an early form of this approach from Benjamin Wright, an attorney who specializes in this area, at an ARMA conference several years ago. I would also like to acknowledge the contribution of Jason Baron, Director of Litigation for NARA, who promotes a similar approach to classifying email. I have added some ideas to the concept, including a pragmatic approach for implementing email governance in the real world.

The Email Pyramid

The general Email Pyramid concept is that email in an organization falls into three tiers. As shown in Figure 1, these are Role Based Classification, Business Records, and the remaining vast volume of unclassified email that most organizations simply do not have the resources or tools to classify. I will explain each tier and how current EIM solutions address its requirements.

Figure 1. The Email Pyramid

Role Based Classification

There are a small number of key individuals in an organization such as the president, general counsel, chief financial officer, and so on whose email should be retained for longer periods of time than other email. The rationale for this is that this email contains a historical record of key activities within the organization. It often contains background information on important events and is frequently used in litigation. For government this email is retained permanently. For private industry it may be retained based on a retention schedule that reflects its importance, and each organization will schedule these records appropriately.

For classification, using tools within the EIM system, these emails are classified based on role. All emails for these individuals are retained and are moved into the EIM system. Identification of the emails for archival and classification is a straightforward process using standard email rules.
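The role-based rule described above can be sketched in a few lines. This is only an illustration, not the rule syntax of any particular EIM or email product: the addresses, the KEY_ROLE_ADDRESSES table, and the role_classification function are all hypothetical names, and a real deployment would more likely draw the list of key individuals from the corporate directory.

```python
from email.message import EmailMessage
from email.utils import getaddresses

# Hypothetical mapping of key individuals' addresses to their roles.
KEY_ROLE_ADDRESSES = {
    "president@example.com": "President",
    "general.counsel@example.com": "General Counsel",
    "cfo@example.com": "Chief Financial Officer",
}


def role_classification(msg):
    """Return the sorted key roles that appear as sender or recipient."""
    headers = []
    for name in ("From", "To", "Cc", "Bcc"):
        headers.extend(str(value) for value in msg.get_all(name, []))
    # getaddresses yields (display name, address) pairs for every mailbox.
    addresses = {addr.lower() for _, addr in getaddresses(headers) if addr}
    return sorted(
        {KEY_ROLE_ADDRESSES[a] for a in addresses if a in KEY_ROLE_ADDRESSES}
    )


def should_archive_by_role(msg):
    """An email is captured by role if any key individual sent or received it."""
    return bool(role_classification(msg))
```

Any message for which should_archive_by_role returns True would be moved into the EIM system and assigned the role-based retention category.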

Business Records

The middle tier, Business Records, includes those emails that are used to document business transactions. These may contain the only record of an approval for a change order, authorization to proceed for a contract, acknowledgement of a transaction, and so on. Many organizations have their users print these emails and file them in a project or contract folder. Often they remain in the email system, ungoverned and unclassified.

These emails should be moved from the production email system to the EIM system. Ideally, automated business processes accomplish this without manual intervention. Each business application should be configured to store the notifications, acknowledgements, and other correspondence that it generates or receives as records within the EIM system, with appropriate metadata. These records are automatically classified based on Document Type, such as Contract, Change Order, FOIA Request, and so on.

For processes that are not yet automated, the EIM system should be configured to allow users to drag and drop emails from the email system into the appropriate folder on the EIM system. It is important that the manual process be as easy as possible so that users will adopt the process without significant resistance.

The result is an EIM solution that maintains a complete history of business transactions, including all supporting documents and emails. This approach supports eDiscovery, FOIA, compliance, and audits by making all required information available through standard search tools.

Auto Classification

The remaining emails often number in the multi-millions. There is no practical way to manually identify or classify this volume, and having users do it themselves is usually not feasible. Today, however, there are auto classification tools available to help accomplish this daunting task.

These emails are archived, and the email archiving tools eliminate duplicate emails. This significantly reduces the number of emails being archived.
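To make the duplicate-elimination step concrete, here is a minimal sketch that keeps one copy of each email based on a content fingerprint. Commercial archiving tools use their own single-instancing logic, so the choice of fields (sender, subject, body), the normalization, and the function names below are assumptions made purely for illustration.

```python
import hashlib


def fingerprint(sender, subject, body):
    """Hash the normalized sender, subject, and body of an email."""
    normalized = "\n".join(
        part.strip().lower() for part in (sender, subject, body)
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(emails):
    """Keep the first copy of each distinct email, preserving order.

    Each email is a dict with 'sender', 'subject', and 'body' keys
    (a hypothetical structure chosen for this example).
    """
    seen = set()
    unique = []
    for email in emails:
        key = fingerprint(email["sender"], email["subject"], email["body"])
        if key not in seen:
            seen.add(key)
            unique.append(email)
    return unique
```

A message sent to twenty recipients would appear in twenty mailboxes but produce a single fingerprint, so only one copy would be archived.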

Auto classification tools automatically classify the remaining emails according to the RM retention schedule. A key component of this approach is to create a “Transitory” big bucket that represents those emails that are non-records. Assign a retention period of 180 days, or a similar rule, to this category. In this manner a very significant portion of the email volume, sometimes 60 to 70 percent, is destroyed in a legally defensible manner.
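The arithmetic behind the Transitory rule is simple, as the following sketch shows. Only the 180-day figure comes from the discussion above. The RETENTION_DAYS table and the function names are hypothetical, and the sketch assumes retention is counted from the date the email was received.

```python
from datetime import date, timedelta

# Retention periods in days per big bucket. Only "Transitory" is taken
# from the article; other buckets would come from the retention schedule.
RETENTION_DAYS = {"Transitory": 180}


def disposition_date(received, bucket):
    """Return the date on which an email becomes eligible for destruction."""
    return received + timedelta(days=RETENTION_DAYS[bucket])


def is_due_for_disposition(received, bucket, today):
    """True once the retention period for the bucket has fully elapsed."""
    return today >= disposition_date(received, bucket)
```

For example, a Transitory email received on October 17, 2012 would become eligible for destruction on April 15, 2013.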

A recommended approach for this type of email classification is to use a big bucket retention schedule. The auto classification engine must be trained, using example emails, to recognize which classification an email belongs to. This task is much more difficult using a typical granular retention schedule.
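To illustrate what "training with example emails" means, here is a toy classifier built on a standard technique, multinomial naive Bayes with add-one smoothing. It is not the engine used by any EIM product, and the class name, tokenizer, and the "Contracts" bucket in the usage note are assumptions for the sake of the example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class BigBucketClassifier:
    """Toy multinomial naive Bayes classifier over retention buckets."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word frequencies
        self.doc_counts = Counter()              # bucket -> training emails
        self.vocabulary = set()

    def train(self, text, bucket):
        """Record one example email that a person has assigned to a bucket."""
        tokens = tokenize(text)
        self.word_counts[bucket].update(tokens)
        self.doc_counts[bucket] += 1
        self.vocabulary.update(tokens)

    def classify(self, text):
        """Return the most probable bucket, or None if nothing was trained."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_bucket, best_score = None, -math.inf
        for bucket in sorted(self.doc_counts):
            counts = self.word_counts[bucket]
            total_words = sum(counts.values())
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[bucket] / total_docs)
            for token in tokens:
                score += math.log(
                    (counts[token] + 1) / (total_words + vocab_size)
                )
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket
```

After training on a handful of lunch invitations labeled "Transitory" and a handful of change-order approvals labeled "Contracts", the classifier assigns a new email to whichever bucket its wording most resembles. With only a few broad buckets, each bucket accumulates many examples, which is one reason a big bucket schedule is easier to train against than a granular one.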

The result is an email archive that has deleted much of the duplicate and transitory content, and which contains remaining emails that are classified according to the official big bucket retention schedule. This approach is not 100% accurate. Accuracy of 60% or better is considered successful. A good classification engine, using ongoing training, should be able to attain 70% to 80%, or better, accuracy. This “good enough” approach, when compared to no governance at all, meets most legal requirements.
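The accuracy figures above are typically established by having a records manager review a sample of classified emails and comparing the results. A minimal sketch of that comparison follows. The function name and the idea of passing two parallel lists are assumptions, not a feature of any specific tool.

```python
def classification_accuracy(predicted, reviewed):
    """Return the fraction of sampled emails where the engine's bucket
    matches the bucket chosen by a human reviewer."""
    if len(predicted) != len(reviewed):
        raise ValueError("both lists must cover the same sample")
    if not reviewed:
        raise ValueError("the sample must not be empty")
    correct = sum(p == r for p, r in zip(predicted, reviewed))
    return correct / len(reviewed)
```

If the engine and the reviewer agree on 7 emails in a sample of 10, the function returns 0.7, which falls within the 70% to 80% range described above.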


Email management is a complicated subject, and most organizations use an approach that is piecemeal at best. Best practice using current technology is to treat email like any other business record, saving what is important, discarding the chaff, and applying standard retention policy to what is retained.
