6 Practical Tips for Designing Taxonomy

By Johannes Scholtes posted Jul 08, 2010 4:59 AM

Now I know this is not the most exciting topic, but it is worthwhile, because taxonomy is becoming increasingly important both to the complete search experience and to the automatic classification of documents in various applications (see also the footnote). Designing the structure for a dynamic taxonomy is not that hard if you follow these tips:

Tip 1: Use a dynamic taxonomy. When one takes the multi-dimensional approach with a traditional static taxonomy, each additional dimension results in an explosion of possible relationships. This makes it very hard to set up and maintain several different taxonomies. In a dynamic taxonomy for faceted search, however, the multiple relationships between the concepts are dynamically inferred and immediately visible to the end user; dynamic taxonomies and faceted search are therefore closely related. One of the benefits is that the combinatorial explosion can be suppressed by combining different dimensions of different concepts in one multi-dimensional structure.

Tip 2: Use fundamental facets. The best approach is to break the dynamic taxonomy down into fundamental facets, each covering one specific dimension and together able to cover the entire universe of discourse. In a way, this is similar to traditional normalization as implemented in relational databases. Examples of such fundamental facets are:
• Chronological order
• Alphabetical order
• Spatial / Geometric order
• Simple to complex order
• Canonical order
• Increasing quantity / quality order

Tip 3: Find logical hierarchical relations. Once the fundamental facets or dimensions have been determined, the next step is to determine hierarchical relationships, or in other words, to find an IS-A hierarchy for the objects in each facet. For example, “concept C” is a subset of “concept P” as part of a child-parent relationship; more specifically, the concept “digital camera” is included in the concept “consumer electronics”.
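
As a minimal sketch of the IS-A idea in Tip 3, one facet can be stored as a set of child-to-parent links. Only the “digital camera” / “consumer electronics” pair comes from the text; the other concepts, the `PARENT` table, and the helper names are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical child -> parent links for a single facet (product type).
# Only "digital camera" IS-A "consumer electronics" is taken from the text;
# the remaining concepts are invented for this example.
PARENT: Dict[str, Optional[str]] = {
    "consumer electronics": None,  # root of the facet
    "digital camera": "consumer electronics",
    "television": "consumer electronics",
    "compact camera": "digital camera",
}


def ancestors(concept: str) -> List[str]:
    """Follow child -> parent links from a concept up to the facet root."""
    chain: List[str] = []
    parent = PARENT[concept]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def is_a(child: str, parent: str) -> bool:
    """True if `child` equals `parent` or lies somewhere below it."""
    return child == parent or parent in ancestors(child)


def children(concept: str) -> List[str]:
    """Direct children of a concept, always in alphabetical order."""
    return sorted(c for c, p in PARENT.items() if p == concept)
```

With this table, `is_a("compact camera", "consumer electronics")` holds because the relation is followed transitively, while the reverse direction does not. Sorting the children is one possible fixed ordering; any rule works as long as it is applied the same way everywhere.
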
Make sure to use a clear and consistent way of ordering the children of concepts in facets. If you don’t, you will confuse your users.

Tip 4: Next, apply a number of known and proven principles to build up the facets. Examples are:
• Principle of division: organize your taxonomy as a set of independent “orthogonal” sub-taxonomies.
• Principle of mutual exclusion: avoid conceptual overlap between the different dimensions.
• Principle of relevance: divide only in ways that improve access.
By using these basic principles and by ordering your taxonomy in a logical manner, for instance from large to small cardinality, you can take a structured approach to developing a dynamic taxonomy.

Tip 5: Also, be aware that taxonomies need to be maintained. Nothing is more disruptive than a badly maintained legacy taxonomy. This is also where text mining and content analytics can make a huge difference. They can help you maintain your taxonomy, inform you of new relevant terms, clean up legacy terms, and outline additional dimensions when things really change.

Tip 6: Last but not least, test your taxonomy with your user community to make sure it is truly practical.

Various examples and many more tips on the process of designing taxonomy can be found here:

Footnote:
1. One of the properties of taxonomies that are used for faceted or exploratory search (see for more information) is that they need to be constructed of different (multi-dimensional) classification schemes. In other words, documents or multi-media objects need to be classified under more than one concept. These individual concepts are the facets used when searching and navigating.
2. The choices made are also relevant for the automatic construction of the facets by using text mining or other content analytics (see: for more information).
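
To make footnote 1 concrete, here is a rough sketch of documents classified under more than one concept, with the facet relationships inferred from the current selection rather than stored in advance. The documents, the facet names (`type`, `year`, `region`), and the functions are all assumptions made up for this illustration, not part of any particular search product.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical documents, each classified under one concept per facet.
DOCS: List[Dict[str, str]] = [
    {"id": "d1", "type": "digital camera", "year": "2009", "region": "Europe"},
    {"id": "d2", "type": "digital camera", "year": "2010", "region": "Asia"},
    {"id": "d3", "type": "television", "year": "2010", "region": "Europe"},
]
FACETS = ("type", "year", "region")


def select(docs: List[Dict[str, str]], **chosen: str) -> List[Dict[str, str]]:
    """Keep the documents that match every chosen facet value."""
    return [d for d in docs if all(d[f] == v for f, v in chosen.items())]


def facet_counts(docs: List[Dict[str, str]]) -> Dict[str, Counter]:
    """Count, per facet, which concepts occur in the given documents."""
    return {f: Counter(d[f] for d in docs) for f in FACETS}


# Narrow on one facet, then recompute what remains reachable in the others.
hits = select(DOCS, year="2010")
counts = facet_counts(hits)
```

After selecting `year="2010"`, the counts for `type` and `region` are derived from the two remaining documents. No explicit table of year-by-type-by-region combinations is ever maintained, which is the sense in which the combinatorial explosion of a static multi-dimensional taxonomy is avoided.
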