Search vs Folders - When and Why

By Mike Clarke posted 07-10-2011 13:12



The search vs. folders debate has been raging in the blogosphere for a few years now. Users are used to both approaches, thanks to Windows Explorer and Google. However, few users understand how to fully use either, or how they work, at more than a superficial level. And they really should not need to. So it comes down to the solution architects and designers to create solutions that users can intuitively use to enter and find documents and other content as they perform their work.

Management, both in the lines of business and in IT, needs to get its head around what the users are trying to accomplish in order to find the best way to enable that work with EDMS/ECM/ERM technologies. Only then can it tell an architect what needs to be delivered at a business level, which can in turn be translated into a solution design. Technology choice does not come into it at this level. Most projects I have seen have not been taken seriously enough by management. Most often there has been no detailed task analysis of information-based business flows. The result is that they are shooting in the dark, and sometimes they get lucky, probably thanks to input from a seasoned SME who has seen what has worked in the past. Most often they don't get it right, and the users quietly go about their business while ignoring the multi-million dollar investment in "productivity" tools. The decision to use search or folders is a fine-grained technical choice that really should come as a natural conclusion of a detailed workflow analysis.

Granted, not all workflows are well structured. In unstructured workflows there is usually a very complex workflow hidden by the amazing ability of the human brain to make decisions. That still needs to be mapped out, because what happens when person X, walking around with that complex set of decision-making rules in their head, retires? Also, how can you possibly design a system to support a complex decision-making process when you don't understand it?

So just when do you use search, folders, or both? I might as well throw in my two cents' worth.

The Question:
Should users of document/content/records management systems, or plain old network drive file storage, be able to search or should they navigate a folder structure?

The Answer(s):

  1. Folders: These are actually just data. There is no physical "folder". It is a concept that humans easily understand and so is widely used. A folder structure is a set of containers, holding electronic files and/or other types of content, which can be organized into a hierarchical structure like a family tree or a filing cabinet. When a file is saved into the folder structure it is merely linked, in a database, to a node of the folder structure (i.e. a folder). When the folder is presented, the code looks up the files linked to that folder and presents them to the user as a graphical representation of a folder structure with a list of linked files.
    1. When the organization of files follows a well known and unambiguous hierarchical structure then a folder structure makes sense.
    2. When the number of files in a folder exceeds 100 or so, it becomes cumbersome for users to navigate pages 1, 2, 3, 4, 5... of the list of files in a folder. This is frustrating when a user is looking for the one document they need to perform the task at hand, not the entire collection.
    3. When there are too many files in one folder, sub-folders with some type of organization, such as date or department, need to be created. This makes for a "deeper" folder structure, which may also frustrate users since they will need to click several times to find what they are looking for.
    4. When there is little or no metadata (data describing the file such as author, date, ID, etc.) attached to the file then the placement in a folder structure may be the best way for users to find it.
    5. Folders are heavily enforced in user interfaces of EDM/ECM/ERM systems.
    6. Folder structures work best when the end users are more skilled and are used to dealing with complex collections of information.
    7. Folders have been used to provide more generic access to information, by providing a global or departmental view of how the information is organized. This is most useful when the business processes involving that information are not well defined or known.
    8. Folder structures work well for records management to reflect a "taxonomy" and classification structure for records and more easily enable the application of retention policies and holds on specific  types of records.
  2. Search: Again, this is data enabled. When the user enters a value, the code searches a complex structure of tables containing rows and columns of data and returns a list of documents, folders and/or other objects that meet the criteria. The problem with the "Google" style of search is that it generally produces a list of results based on "fuzzy" matching. If you want an accurate set of results you need to know exactly what you are searching for, and the data needs to be present for the search to find it. So the person importing a scanned image of an invoice needs to enter the PO number, the invoice number and whatever other search criteria you may later require. If they don't, you will not see it in the search results. However, you may be able to navigate a folder structure to find it...
    1. Search requires metadata to be available and accurate.
    2. Search may also include the text contents of a file if it can be read as text.  If it is in a graphics file or an audio file or other binary format the contents of the file cannot be searched and the metadata must be available.  This is most commonly found in engineering files, scanned images and multimedia.
    3. Generic search forms list the possible attributes (columns in a table) and allow you to locate a set of rows matching the criteria you enter. This is fine for occasional searches, especially by well-trained or highly skilled users who understand both the information itself and how searches actually work. Because of their slow response, number of clicks and complexity, they tend to be a problem when a user needs fast results or performs high-repetition tasks.
    4. Generic search forms are essential for providing power user and admin search across various types of documents, which span folder structures and business processes.
    5. Generic search forms may produce unexpected results such as the inclusion of confidential or sensitive information in search results.  This may violate regulatory compliance and the results may be very serious.  This forces the architect to always create an underlying security model that protects such information assets and displays it only to those who need to see it.
    6. User interfaces which are specifically designed to support a business process can have dedicated searches embedded in forms.  For example Documentum's xCP platform has a case management tool called TaskSpace where specific queries can be embedded in forms to display specific results and to query external systems.
    7. The embedded searches assume that the searches have been pre-defined and that a task-based analysis has been done in advance.
    8. If the structure and search requirements for the information are well defined and there are very large numbers of files in the repository, the folder structures can be replaced with a hierarchical structure of pre-defined searches which end users can navigate. This will require heavy customization of most applications.
  3. Combination of Search and Folders:  Inevitably there needs to be a combination of folders and search for a variety of reasons:
    1. Admin and power users need to do generic and ad-hoc searches.
    2. Admin and power users sometimes need to navigate folder structures to find files or folders that have been damaged or flawed or require maintenance such as corrections or security updates.
    3. Records managers need taxonomy and classification based folder structures to manage retention, holds, etc. effectively. In some EDMS products you can cross-link folders and/or files from a user-oriented or functional folder structure to a records management system, providing the best of both worlds.
    4. Search-based navigation trees can be used to let users navigate the information from a different perspective or "facet" than those of admins or records managers. This reduces the need for complex folder structures, which may create performance issues in some cases and maintenance issues should the structures ever need to change.
    5. End users need predefined searches available to them whenever possible to speed up the tasks at hand, and usually also a simplified and/or targeted generic search for unusual or unexpected tasks.
    6. End users need to have the user interface reflect the context of the tasks at hand so that the files they need at a point in time are readily at hand.
    7. If a folder structure is well understood and not overpopulated it represents a low cost way of solving the information retrieval and upload process.
    8. Folders are sometimes an easy way to define security. However, when the same types of documents need different layers of security, you need to build deeper folder structures. A better way to think about security is to build a logical security matrix in a spreadsheet and use listeners that fire when files are imported or updated, so that a specific security template, access control list template or retention policy can be applied to each document or record. In Documentum these are enabled via "type-based business objects" (TBOs).
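
To make the "folders are just data" and "search requires metadata" points concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table names, column names and sample documents are all hypothetical and chosen only for illustration. They are not the schema of Documentum, SharePoint or any other product. The sketch shows a folder listing as nothing more than a lookup of link rows, and a metadata search that can only find a document whose metadata was entered at import time.

```python
import sqlite3


def build_repository():
    """Create a tiny illustrative repository (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE folder (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
        CREATE TABLE document (
            id INTEGER PRIMARY KEY, name TEXT,
            po_number TEXT, invoice_number TEXT
        );
        -- A document is "in" a folder only because a link row says so.
        CREATE TABLE folder_link (folder_id INTEGER, document_id INTEGER);
    """)
    db.executemany("INSERT INTO folder VALUES (?, ?, ?)", [
        (1, None, "Finance"),
        (2, 1, "Invoices"),
    ])
    db.executemany("INSERT INTO document VALUES (?, ?, ?, ?)", [
        (10, "scan_0001.tif", "PO-100", "INV-555"),
        (11, "scan_0002.tif", None, None),  # imported without metadata
    ])
    db.executemany("INSERT INTO folder_link VALUES (?, ?)", [(2, 10), (2, 11)])
    return db


def list_folder(db, folder_id):
    """'Open' a folder: look up the documents linked to that folder node."""
    rows = db.execute(
        "SELECT d.name FROM document d "
        "JOIN folder_link l ON l.document_id = d.id "
        "WHERE l.folder_id = ? ORDER BY d.name",
        (folder_id,),
    )
    return [row[0] for row in rows]


def search_by_invoice(db, invoice_number):
    """Exact metadata search: only finds documents whose metadata was entered."""
    rows = db.execute(
        "SELECT name FROM document WHERE invoice_number = ? ORDER BY name",
        (invoice_number,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    repo = build_repository()
    print(list_folder(repo, 2))               # both scans are linked to "Invoices"
    print(search_by_invoice(repo, "INV-555"))  # only the scan with metadata
```

In this sketch the second scan has no invoice number, so no invoice search will ever return it, yet it still shows up when the "Invoices" folder is opened. That is the trade-off described above: folder placement can rescue documents with poor metadata, while search depends entirely on the metadata being present and accurate.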

#find #architect #folders #SharePoint #repository #Search #ScanningandCapture