SharePoint and Dublin Core

By Alfred de Weerd posted 06-03-2014 16:29

  

 

The classic Dublin Core Metadata Element Set is a vocabulary of fifteen metadata elements that can be used to describe documents (or other information assets). The elements are generic by design and applicable to a broad range of resources, which is precisely why they pop up in almost any discussion on metadata.

The Dublin Core elements are available as Site Columns in SharePoint. The entire group of fifteen elements is available through the Dublin Core Columns content type. A number of the columns are also in use as standard columns for document libraries. When the Dublin Core columns match your requirements, you have two options: use the Dublin Core Columns content type, or create your own content type and add a selection of the Dublin Core columns to it.

The 15 elements of the Dublin Core set are given in the table below. In the table, I combine information from the Dublin Core website with the translation to SharePoint. We all know SharePoint and we all know Dublin Core, so what would be the use of reading this? Well:

  • It gives you greater insight into many of the SharePoint standard columns you have been using for so long;
  • As stated above, Dublin Core tends to pop up in many metadata discussions, so it’s good to have an idea of what it is about (as we will see in a next post, there is actually a lot more to Dublin Core than just the fifteen elements presented in the table);
  • When you have to describe your own metadata, the definitions from Dublin Core are an excellent starting point (unless you are working in an environment dominated by academics, I advise you to make the descriptions a bit less formal);
  • Dublin Core is the basis for almost every metadata standard, both formally and informally. ISO Standard 15836:2009 and ANSI/NISO Standard Z39.85-2007 belong to the first category;
  • It can be a checklist to evaluate your own metadata set. This doesn’t mean that you have to include all elements in your metadata scheme, however;
  • In the descriptions of the elements, guidance is sometimes given on implementation, which you can reuse in your SharePoint metadata;
  • When doing the information analysis, you are likely to come across people with a background in library science or archiving. Understanding Dublin Core will help you bridge the communication gap.
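
The checklist idea from the list above can be sketched in a few lines of Python. This is only an illustration: the scheme in `my_scheme` and the helper `check_coverage` are hypothetical, and the column names are assumptions rather than a recommended design.

```python
# The fifteen elements of the classic Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
)


def check_coverage(schema_mapping):
    """Split the Dublin Core elements into covered and uncovered ones.

    schema_mapping maps a Dublin Core element name to the name of the
    column in your own scheme that plays that role.
    """
    covered = {e: schema_mapping[e] for e in DUBLIN_CORE_ELEMENTS if e in schema_mapping}
    missing = [e for e in DUBLIN_CORE_ELEMENTS if e not in schema_mapping]
    return covered, missing


# Hypothetical scheme for a project document library (an assumption).
my_scheme = {
    "creator": "Created By",
    "date": "Created",
    "title": "Title",
    "subject": "Enterprise Keywords",
}

covered, missing = check_coverage(my_scheme)
print("Covered:", covered)
print("Not covered (decide deliberately):", missing)
```

The point of the `missing` list is not that it should be empty, but that every element left out is left out on purpose.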

 

 

 

 

Contributor

Definition

An entity responsible for making contributions to the resource.

Comment

Examples of a Contributor include a person, an organization, or a service. Typically, the name of a Contributor should be used to indicate the entity.

SharePoint considerations

The SharePoint Created by and Modified by columns are available by default in any document library. Created by, the author of the initial version of the document, is not made visible in the default view, but Modified by, the author of the last version of the document, is. Both are single values, whereas the intention of Dublin Core is to capture all (relevant) contributors. The Contributor column of the Dublin Core columns content type consists of multiple lines of text, which makes it more flexible, but it also means there is no validation.

 

Coverage

Definition

The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant.

Comment

Spatial topic and spatial applicability may be a named place or a location specified by its geographic coordinates. Temporal topic may be a named period, date, or date range. A jurisdiction may be a named administrative entity or a geographic place to which the resource applies. Recommended best practice is to use a controlled vocabulary such as the Thesaurus of Geographic Names [TGN]. Where appropriate, named places or time periods can be used in preference to numeric identifiers such as sets of coordinates or date ranges.

SharePoint considerations

The Coverage element is not provided by default. It is very generic, whereas most use cases benefit from being specific. A generic metadata field like Coverage would only serve equally generic use cases, like searching for documents created at a single geographical location. The Coverage column of the Dublin Core columns content type consists of a single line of text.

 

Creator

Definition

An entity primarily responsible for making the resource.

Comment

Examples of a Creator include a person, an organization, or a service. Typically, the name of a Creator should be used to indicate the entity.

SharePoint considerations

The SharePoint Created by column is available by default in any document library and has the same meaning as the Dublin Core Creator element. It is not made visible in the default view. Related fields are Modified by and Created (refer to Contributor and Date in this table, respectively). The Creator column of the Dublin Core columns content type consists of a single line of text.

 

Date

Definition

A point or period of time associated with an event in the lifecycle of the resource.

Comment

Date may be used to express temporal information at any level of granularity. Recommended best practice is to use an encoding scheme, such as the W3CDTF profile of ISO 8601 [W3CDTF].

SharePoint considerations

The SharePoint Created and Modified columns are available by default in any document library. Created, the date of initial creation of the document, is not made visible in the default view, but Modified, the date the document was last modified, is. Both are single values, whereas the intention of Dublin Core is to capture all (relevant) dates. The Dublin Core columns content type contains the Date Created and Date Modified columns of type Date. Having Date types seems inconsistent with having free-format fields for Contributor and Creator.
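
To make the W3CDTF recommendation in the Comment above concrete, here is a minimal Python sketch that formats a date at different levels of granularity. The function name `to_w3cdtf` and the sample date are made up for illustration, and only three date-only granularities plus the full date-time form are handled.

```python
from datetime import datetime, timezone

# strftime patterns for the W3CDTF granularities that have no time part.
_DATE_ONLY = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}


def to_w3cdtf(moment, granularity="second"):
    """Format a datetime as a W3CDTF string at the requested granularity."""
    if granularity in _DATE_ONLY:
        return moment.strftime(_DATE_ONLY[granularity])
    if granularity != "second":
        raise ValueError(f"unsupported granularity: {granularity}")
    if moment.tzinfo is None:
        # W3CDTF requires a time zone designator whenever a time is given.
        raise ValueError("a time zone is required when a time is given")
    # isoformat() yields e.g. 2020-01-15T09:30:00+00:00.
    return moment.replace(microsecond=0).isoformat()


# Arbitrary sample value (an assumption, not taken from any real document).
sample = datetime(2020, 1, 15, 9, 30, tzinfo=timezone.utc)
print(to_w3cdtf(sample, "year"))
print(to_w3cdtf(sample, "day"))
print(to_w3cdtf(sample))
```

A Date column in SharePoint already gives you validated dates, so a conversion like this only matters when values are exported or when a free-text field is used instead.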

 

Description

Definition

An account of the resource.

Comment

Description may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource.

SharePoint considerations

The Description element is not included by default. Think carefully about whether this field is required. A combination of title and keywords may be enough for your purpose, especially in combination with the standard text search capabilities within SharePoint. The Description column of the Dublin Core columns content type contains multiple lines of text.

 

Format

Definition

The file format, physical medium, or dimensions of the resource.

Comment

Examples of dimensions include size and duration. Recommended best practice is to use a controlled vocabulary such as the list of Internet Media Types [MIME].

SharePoint considerations

The Format column is not included by default. SharePoint does provide the File Size column, which supplements the type with additional information. This is in line with the examples given by Dublin Core in their elements list. The Format column of the Dublin Core columns content type consists of a single line of text, which makes it more flexible, but it also means there is no validation.
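
Because the Format column is free text, one way to stay close to the recommended MIME vocabulary is to derive the value from the file name instead of typing it. The sketch below uses Python's standard `mimetypes` module; the helper name `suggest_format` and the file names are hypothetical, and the result depends on the MIME table known to the machine running it.

```python
import mimetypes


def suggest_format(file_name):
    """Return a MIME type for a file name, or None when the type is unknown."""
    mime_type, _encoding = mimetypes.guess_type(file_name)
    return mime_type


# Hypothetical file names, used only to show the behaviour.
for name in ("annual-report.pdf", "logo.png", "mystery.unknownext"):
    print(name, "->", suggest_format(name))
```

An unknown extension yields `None`, which is a useful signal that a human should fill in the value.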

 

Identifier

Definition

An unambiguous reference to the resource within a given context.

Comment

Recommended best practice is to identify the resource by means of a string conforming to a formal identification system.

SharePoint considerations

The SharePoint ID column is available by default in any document library. However, it is not a universal ID and does not match the intention of the Identifier element: it is simply a number starting from 1 within each document library. The Document ID feature does provide functionality in line with the Dublin Core Identifier element. Before using the Document ID feature in your content management system, you must first enable it for the site collection(s) your documents are hosted in. When the service is enabled, a new column is automatically added to the Document and Document Set content types. The Resource Identifier column of the Dublin Core columns content type consists of a single line of text, which makes it flexible enough for your own formats, but you miss the benefits of the Document ID feature (a unique ID and a link that remains stable even after the document is moved).

 

Language

Definition

A language of the resource.

Comment

Recommended best practice is to use a controlled vocabulary such as RFC 4646 [RFC4646].

SharePoint considerations

The Language column is not available by default in a document library. The Language column of the Dublin Core columns content type is of the Choice type. It provides a list of choices that will be workable in most situations, but it is not RFC 4646 compliant.

 

Publisher

Definition

An entity responsible for making the resource available.

Comment

Examples of a Publisher include a person, an organization, or a service. Typically, the name of a Publisher should be used to indicate the entity.

SharePoint considerations

The Publisher column is not available by default in a document library. The purpose of this field is to identify the entity that provides access to the resource. Within SharePoint, this may be the creator as specified in the Created By column. In many cases it will be an organizational unit. The concept is related to ownership as often encountered in SharePoint solutions. The owner is not only responsible for rights on the document, but also for the quality of the document. In some cases an Owner column is useful; in other cases the ownership is determined on a higher level (typically site or site collection). The Publisher column of the Dublin Core columns content type consists of a single line of text.

 

Relation

Definition

A related resource.

Comment

Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system.

SharePoint considerations

The Relation column is not available by default in a document library. Relation as meant within Dublin Core is very generic. More useful in the SharePoint context would be referring to other documents, either by Document ID or with links. Using document sets would also be a way to relate documents to each other. The Relation column of the Dublin Core columns content type contains multiple lines of text.

 

Rights

Definition

Information about rights held in and over the resource.

Comment

Typically, rights information includes a statement about various property rights associated with the resource, including intellectual property rights.

SharePoint considerations

The Rights column is not available by default in a document library. Note that this is something different from access rights. The Rights Management column of the Dublin Core columns content type contains multiple lines of text.

 

Source

Definition

A related resource from which the described resource is derived.

Comment

The described resource may be derived from the related resource in whole or in part. Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system.

SharePoint considerations

The Source column is not available by default in a document library. Source may be regarded as a subset of Relation. Evaluate whether you really have a need to use a separate column for Source. The Source column of the Dublin Core columns content type contains multiple lines of text.

 

Subject

Definition

The topic of the resource.

Comment

Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary.

SharePoint considerations

The Subject column is not available by default in a document library. For SharePoint, the best general implementation conceivable would be to use Enterprise Keywords. Note that these are general keywords. In most cases you will need specific metadata as well. The Subject column of the Dublin Core columns content type consists of a single line of text. The content type also contains the Keywords column, which holds multiple lines of text. It seems Microsoft added its own flavor by separating Subject and Keywords.

 

Title

Definition

A name given to the resource.

Comment

Typically, a Title will be a name by which the resource is formally known.

SharePoint considerations

The SharePoint Title column is available by default in any document library. It is not made visible in the default view. SharePoint also has two Name columns, which refer to the file name and which are automatically updated when uploading or saving. One of the Name columns has the Document Edit context menu attached to it, so in most cases, you will need it. The Title column of the Dublin Core columns content type consists of a single line of text. Since there are also standard Title and Name columns, it seems superfluous.

 

Type

Definition

The nature or genre of the resource.

Comment

Recommended best practice is to use a controlled vocabulary such as the DCMI Type Vocabulary [DCMITYPE]. To describe the file format, physical medium, or dimensions of the resource, use the Format element.

SharePoint considerations

The SharePoint Type column is available by default in any document library. It is also one of four columns that are visible in the default view. Since only files can live within SharePoint, the type is always a file type. These file types do not match the DCMI Type Vocabulary. In all but some very generic cases, the file type will be of more value to future users than the DCMI Type. The type is made visible by a pictogram. The Resource Type column of the Dublin Core columns content type consists of a single line of text. It is more flexible than the default Type column, but it also lacks its advantages, like being determined automatically and being shown by a symbol easily recognized by the majority of users.
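
The mismatch between file types and the DCMI Type Vocabulary can be illustrated with a short Python sketch. The twelve vocabulary terms are taken from the DCMI Type Vocabulary; the extension mapping and the helper `dcmi_type_for` are hypothetical and deliberately crude (a scanned PDF, for instance, is arguably an image rather than a text).

```python
# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = frozenset({
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
})

# Hypothetical, lossy mapping from file extension to a DCMI type.
EXTENSION_TO_DCMI_TYPE = {
    "docx": "Text",
    "pdf": "Text",
    "xlsx": "Dataset",
    "png": "StillImage",
    "mp4": "MovingImage",
    "mp3": "Sound",
}


def dcmi_type_for(file_name):
    """Guess a DCMI type from the file extension, or None if unmapped."""
    if "." not in file_name:
        return None
    extension = file_name.rsplit(".", 1)[-1].lower()
    return EXTENSION_TO_DCMI_TYPE.get(extension)


print(dcmi_type_for("Project plan.DOCX"))
print(dcmi_type_for("README"))
```

Several file types collapse into the same DCMI type, which is one way to see why the file type usually tells future users more.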

Dublin Core provides creators and dates as separate elements, whereas in most cases you want to see which contributions each creator made on a specific date. So there is a triad: contributor, contribution, date. SharePoint provides this metadata in the version history (when turned on), in conjunction with the change history in Word (when turned on). In this way you have a much more detailed account of what happened to the document over time than metadata fields could provide. Metadata like this is also commonly recorded in a version history table inside the document itself.

In some cases (like eDiscovery), you may need to search for a specific date or contributor without knowing in which document to search. Targeted metadata like the Dublin Core columns Date and Creator are then ideal. SharePoint standard search might help you as well, but the name of the creator is then not targeted as a creator name and the date is not targeted as a creation date. If you know in which document you want to search for a specific date or creator (which is commonly the case), the mechanisms above are more than sufficient. Be aware that standard search can still find creators and dates when they are mentioned in the version history or some other place in the document. The message here is to keep the metadata use cases in mind before you start introducing new metadata fields, but this is of course true for any metadata field.

It is also possible to incorporate the SharePoint version in a Word document, using template-based transfer. Because the version is a system column, however, you cannot simply use Quick Parts; you have to use SharePoint labels to transfer the values to Word.
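
The contributor, contribution, date triad and the "search without knowing the document" use case discussed above can be modelled in a few lines of Python. This is a sketch of the idea only: the `Contribution` record, the sample history, and `find_contributions` are hypothetical and do not represent how SharePoint stores or searches version history.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Contribution:
    """One entry of the triad: who contributed what, and when."""
    document: str
    contributor: str
    summary: str
    made_on: date


# Hypothetical version history spanning two documents.
HISTORY = [
    Contribution("Project plan.docx", "A. Jansen", "Initial draft", date(2014, 1, 10)),
    Contribution("Project plan.docx", "B. de Vries", "Added budget section", date(2014, 2, 3)),
    Contribution("Risk log.xlsx", "B. de Vries", "Created risk log", date(2014, 2, 3)),
]


def find_contributions(history, contributor=None, made_on=None):
    """Return the contributions matching the given contributor and/or date."""
    return [
        c for c in history
        if (contributor is None or c.contributor == contributor)
        and (made_on is None or c.made_on == made_on)
    ]


for hit in find_contributions(HISTORY, contributor="B. de Vries"):
    print(hit.document, "-", hit.summary)
```

Because contributor and date are separate, typed fields here, the query matches them as a contributor and a date, which is exactly what a full-text search cannot guarantee.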

Dublin Core stresses the importance of having the same meaning for all metadata fields. When you use a SharePoint standard column for an entity that has another name in the Dublin Core set, you must verify that no problems arise as a consequence. When you keep records for up to 10 years and then destroy them, problems are unlikely, provided you have documented your decision in the information architecture or some other place. When your records are meant to be kept ‘eternally’, like documents created within governments, the difference in naming may well become a problem. You may then need to use the Dublin Core columns or, alternatively, rename the columns when the records are transferred to the central archive.
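
Renaming columns at transfer time can be as simple as a documented lookup table. The Python sketch below assumes the correspondences described in the table above (Created By as creator, Modified By as contributor, and so on); the column display names, the sample item, and the function `to_dublin_core` are assumptions for illustration, not an export API.

```python
# Assumed mapping from SharePoint column display names to Dublin Core
# elements, following the correspondences described in this post.
SHAREPOINT_TO_DUBLIN_CORE = {
    "Created By": "creator",
    "Modified By": "contributor",
    "Created": "date",
    "Title": "title",
    "Document ID": "identifier",
}


def to_dublin_core(item):
    """Rename known columns; keep unmapped ones separately for review."""
    record, unmapped = {}, {}
    for column, value in item.items():
        element = SHAREPOINT_TO_DUBLIN_CORE.get(column)
        if element is None:
            unmapped[column] = value
        else:
            # Dublin Core elements are repeatable, so values are kept in lists.
            record.setdefault(element, []).append(value)
    return record, unmapped


# Hypothetical library item.
item = {"Title": "Project plan", "Created By": "A. Jansen", "File Size": 1024}
record, unmapped = to_dublin_core(item)
print(record)
print(unmapped)
```

Keeping the mapping in one place also serves as the documentation of the naming decision mentioned above.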

Dublin Core qualifiers

In the previous section we have seen the basic elements of Dublin Core (the Simple level). The model can be extended by using qualifiers (the Qualified level), which provide additional meaning to the metadata value:

  • Element Refinement. These qualifiers make the meaning of an element narrower or more specific. A refined element shares the meaning of the unqualified element, but with a more restricted scope. A client that does not understand a specific element refinement term should be able to ignore the qualifier and treat the metadata value as if it were an unqualified (broader) element. The definitions of element refinement terms for qualifiers must be publicly available.

  • Encoding Scheme. These qualifiers identify schemes that aid in the interpretation of an element value. These schemes include controlled vocabularies and formal notations or parsing rules. A value expressed using an encoding scheme will thus be a token selected from a controlled vocabulary (e.g., a term from a classification system or set of subject headings) or a string formatted in accordance with a formal notation (e.g., "2000-01-01" as the standard expression of a date). If an encoding scheme is not understood by a client or agent, the value may still be useful to a human reader. The definitive description of an encoding scheme for qualifiers must be clearly identified and available for public use.

A full list of all qualifiers can be found at the Qualifiers list. Note that this approach is valuable only if you need to adhere to the Dublin Core metadata elements. In that case, the qualifiers give you the opportunity to provide extra information on the metadata. In most cases within the SharePoint context, you will be better off using a more specific metadata field name. So instead of specifying a date and then noting that it is the creation date, use the default Created metadata column.
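
The rule that a client may ignore an element refinement and fall back to the broader element can be sketched as follows. The dotted `element.refinement` notation and the function name `dumb_down` are assumptions made for this example; the point is only that dropping the qualifier still leaves a valid unqualified element.

```python
# The fifteen unqualified (Simple level) elements.
SIMPLE_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def dumb_down(qualified_name):
    """Drop the refinement from 'element.refinement' and return the element."""
    element, _separator, _refinement = qualified_name.partition(".")
    if element not in SIMPLE_ELEMENTS:
        raise ValueError(f"not a Dublin Core element: {element!r}")
    return element


# Assumed notation: the refinement follows the element after a dot.
for name in ("date.created", "coverage.spatial", "title"):
    print(name, "->", dumb_down(name))
```

A client that does know the refinement can still use the full name; one that does not loses precision but not the value.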

 

References

http://dublincore.org

http://dublincore.org/documents/dcmi-type-vocabulary

 http://dublincore.org/documents/usageguide/elements.shtml

http://dublincore.org/documents/usageguide/qualifiers.shtml

 

 

 

 

 


