How to Determine the Best Form for Your Taxonomy

By Carl Weise posted 05-17-2012 15:06


The first function of a taxonomy is to help you understand the structure of your knowledge domain at a glance.  You should be able to look at the structure and predict where to find each part of the domain in more detail.  Predictability turns out to be the most important feature of good taxonomy design.  Achieving it requires understanding the natural categorization patterns of your different user communities, and balancing the ways in which they compete, or conflict, with each other.

Different Forms of Taxonomies


List

A list is the simplest form of taxonomy structure: a collection of items that have some basic relationship to each other, e.g. similarity of attributes or purpose (a shopping list, a list of files on my shelf), steps in a process, a sequence of actions, activities that are frequently done together, or project roles that you manage.  When you look at a list, you should usually be able to understand the principle of similarity that brings its items together.

The main drawback of a list is that it becomes difficult to scan quickly and make sense of once it grows beyond about 12-15 items.  The exception is lists that are very familiar and have entered your long-term memory, e.g. lists of countries.


Tree

Once a list gets too long, you usually start breaking it up into clusters and creating parent categories for them, which produces a tree structure.  The tree is the structure most traditionally associated with taxonomies: subdivided at the top level, with sub-categories underneath, as in a folder structure on a computer.  Most naturally created tree structures are not very easy or predictable for other people to navigate.  The reason is that we are often inconsistent in how we apply principles of subdivision within, and between, levels.

You might sort things into some folders by types of documents but, in others, you might use people or activities, topics, or dates.  This lack of consistency is what makes tree structures unpredictable and hard for other people to use.


Hierarchy

A hierarchy is a tree structure that follows very strict rules about how it is subdivided.  The same principle of subdivision must be consistently applied at every level.  This makes the hierarchy very easy to predict.  A true hierarchy should be exhaustive (it contains all possible topics), and its categories must be mutually exclusive (there cannot be any ambiguity or overlap between them).  Each topic can exist in only one place in the structure.
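
The "one place only" rule lends itself to a mechanical check.  The Python sketch below is an illustration only, not part of any taxonomy tool: it assumes the hierarchy is stored as a nested dictionary, and the category names and the helper functions find_paths and duplicated_topics are invented for the example.  It reports any topic that appears in more than one place.

```python
from collections import defaultdict

# Hypothetical hierarchy stored as nested dicts: each key is a category,
# each value holds its sub-categories (an empty dict marks a leaf topic).
hierarchy = {
    "Documents": {
        "Contracts": {"Leases": {}, "Service agreements": {}},
        "Reports": {"Annual reports": {}, "Leases": {}},  # "Leases" repeated on purpose
    }
}

def find_paths(tree, prefix=()):
    """Yield (topic, path) for every node in the nested-dict tree."""
    for name, children in tree.items():
        path = prefix + (name,)
        yield name, path
        yield from find_paths(children, path)

def duplicated_topics(tree):
    """Return {topic: [paths]} for topics that occur in more than one place."""
    locations = defaultdict(list)
    for name, path in find_paths(tree):
        locations[name].append(" > ".join(path))
    return {name: paths for name, paths in locations.items() if len(paths) > 1}

duplicates = duplicated_topics(hierarchy)
for topic, paths in duplicates.items():
    print(f"'{topic}' appears in {len(paths)} places: {paths}")
```

An empty result means every topic has exactly one home.  A check like this only covers mutual exclusivity by name; whether the hierarchy is exhaustive still has to be judged against the domain itself.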


Polyhierarchy

A polyhierarchy tries to solve the problem of naturally occurring ambiguities in our information world by selectively breaking the strict rules of the hierarchy.  In a polyhierarchy, a topic can have more than one parent if there is more than one way that people want to look for it.  For example, “pneumonia” might be a topic that could have a parent concept “lungs” for people who are interested in the parts of the body affected and a parent concept “viral illnesses” for those who are interested in causal factors.

A “report” could have a parent concept “document types” as well as “activities”.
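
One way to picture a polyhierarchy is as a mapping from each topic to a list of parents, rather than to a single parent.  The Python sketch below assumes that representation; the pneumonia and report entries come from the examples above, while the top-level concepts ("parts of the body", "causal factors") and the function names are assumptions made for illustration.

```python
# Hypothetical polyhierarchy: each topic maps to the list of its parents.
# "pneumonia" and "report" follow the examples in the text; the
# higher-level concepts are invented to give the structure a second level.
parents = {
    "pneumonia": ["lungs", "viral illnesses"],
    "lungs": ["parts of the body"],
    "viral illnesses": ["causal factors"],
    "report": ["document types", "activities"],
}

def ancestors(topic, parents):
    """Return every concept reachable by following parent links upward."""
    found = set()
    stack = list(parents.get(topic, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, []))
    return found

def children_of(parent, parents):
    """Return the topics filed directly under a parent, sorted by name."""
    return sorted(topic for topic, ps in parents.items() if parent in ps)

print(sorted(ancestors("pneumonia", parents)))
print(children_of("lungs", parents))
print(children_of("viral illnesses", parents))
```

The same topic turns up under both "lungs" and "viral illnesses", which is exactly the behaviour a strict hierarchy forbids and a polyhierarchy allows.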


Facets

A facet is a taxonomy structure that takes into account only one attribute of a piece of information.  Normally, you would use a system of facets, where each facet describes a different attribute of the information.  For example, one facet could list all of the document types, another could list all the business activities, another could list all the job roles, and so on.  Each facet can be a simple list, a tree, or a hierarchy.

Used in combination, facets give a rich description of the content, but each facet also provides a distinct way of organizing and finding the same content.  This can overcome the problem of having different competing ways of organizing the same information among different user groups.  So people who like to organize by document types can find it that way, others who organize by job roles can find it that way, and so on.
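
To make the idea concrete, here is a minimal Python sketch of faceted finding.  It assumes each piece of content is tagged with one value per facet; the three facets mirror the ones named above, but the documents, the facet values and the function names are invented for illustration.

```python
# Hypothetical catalogue: every item carries one value per facet.
# Facet names follow the text; the items and values are invented.
documents = [
    {"title": "Q1 budget review", "document_type": "report",
     "business_activity": "finance", "job_role": "analyst"},
    {"title": "Hiring checklist", "document_type": "form",
     "business_activity": "recruitment", "job_role": "manager"},
    {"title": "Audit findings", "document_type": "report",
     "business_activity": "finance", "job_role": "auditor"},
]

def filter_by_facets(items, selected):
    """Keep the items whose facet values match every selected facet value."""
    return [item for item in items
            if all(item.get(facet) == value for facet, value in selected.items())]

def facet_values(items, facet):
    """List the distinct values of one facet, e.g. to build a browse menu."""
    return sorted({item[facet] for item in items})

# Two different routes to the same content, plus a combined query.
by_type = filter_by_facets(documents, {"document_type": "report"})
by_role = filter_by_facets(documents, {"job_role": "auditor"})
combined = filter_by_facets(
    documents, {"document_type": "report", "business_activity": "finance"})

print([d["title"] for d in by_type])
print([d["title"] for d in by_role])
print(facet_values(documents, "business_activity"))
```

"Audit findings" is reachable by document type and by job role, so neither user group has to adopt the other's way of organizing.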


Matrix

A matrix structure is usually two (sometimes three) facets presented in a table format, so that you can explore the combinations of facet attributes where the facets intersect on the vertical and horizontal axes.  Matrix structures work well when you have a limited number of entities to work with and the facets, used in combination, describe those entities exhaustively.  One of their benefits is that they can help you spot gaps in your inventory, e.g. when a cell in the matrix is empty.
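
Gap spotting can be sketched in a few lines of Python.  The example assumes two facets (document type on one axis, business activity on the other) and a small invented inventory; the facet values and the functions build_matrix and empty_cells are hypothetical, chosen only to show the idea.

```python
from itertools import product

# Two hypothetical facets forming the axes of the matrix.
doc_types = ["policy", "procedure", "form"]
activities = ["finance", "recruitment"]

# Invented inventory: each item is tagged with one value from each facet.
inventory = [
    ("policy", "finance"),
    ("procedure", "finance"),
    ("policy", "recruitment"),
    ("form", "recruitment"),
]

def build_matrix(rows, cols, items):
    """Count the items in every cell; assumes each item uses known facet values."""
    matrix = {(row, col): 0 for row, col in product(rows, cols)}
    for row, col in items:
        matrix[(row, col)] += 1
    return matrix

def empty_cells(matrix):
    """Return the facet combinations that have no items at all."""
    return [cell for cell, count in matrix.items() if count == 0]

matrix = build_matrix(doc_types, activities, inventory)
gaps = empty_cells(matrix)
print(gaps)
```

Each empty cell is a prompt for a question: is the content genuinely missing, or is that combination simply not relevant to your organization?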

System map

System maps are visual representations of your knowledge domain, where proximity and connections between entities are used to express the relationships between them.  System maps can be descriptive or conceptual.  Examples of descriptive maps might be a map of a transport network or a map of the arteries in the human body.  Mind maps or concept maps are examples of the more conceptual system maps.  So are process maps.  All of these maps help to organize concepts and entities, and they are often used to communicate the key nomenclature or vocabulary of your domain.

In a web-enabled format, you can also hyperlink further information resources to the elements in the map.

Choosing which structure or combination of structures you want to use will depend on:

The number of entities you want to cover in your taxonomy: lists, matrices and system maps tend to work better with smaller numbers of entities; trees, hierarchies and polyhierarchies work in the mid-range; and facets work well for very large numbers.

The extent to which you want to give a visual representation of a domain that is easy to comprehend and navigate: system maps and matrices do this best.  Facets do not support this function quite so well, because users have to decompose their queries into the various facet elements.

The feature common to all these structures is their ability to give a predictable structure so that you can navigate and find the information you are looking for.

So, which is the best form of taxonomy for your purposes?

Tell us about your efforts to create a taxonomy in your organization.

What lessons did you learn in creating your taxonomy?


I will be speaking at the following events:

  1. June 5th – 8th, 2012 AIIM ERM Master class in San Francisco, CA
  2. June 12th, 2012 Info360 ECM Practitioner Pre-Con in New York, NY
  3. June 19th – 22nd, 2012 AIIM ECM Master class in Houston, TX


#ERM #ECM #TaxonomyandMetadata #ElectronicRecordsManagement