Meta data standards are rarely the first thing considered when developing content management systems; sooner or later, however, the issue comes to the forefront. Sometimes the trigger is that administration becomes unwieldy, search results are imprecise or users ask for features, such as navigation, that cannot be introduced without meta data. When you hit one of these problems, questions arise. You know you need meta data, but where do you start? What should be included? How do you get content creators to actually add meta data? Once you have it, how do you exploit it?

Meta data serves several purposes in content management systems (CMSs) including administration, search and navigation, policy enforcement, access control and version control. Many of these tasks are managed by CMSs out of the box; others must be implemented on a custom basis. When implementing additional meta data, keep in mind basic principles of meta data design. Track only information that is needed. Use defaults as much as possible. Use controlled vocabularies. Always keep content creators and end users in mind. Here are some specific recommendations.
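As an illustration of the design principles above (few fields, defaults, controlled vocabularies), here is a minimal Python sketch. The `ContentMetadata` class, its field names and the `AUDIENCES` vocabulary are all hypothetical; a real standard would take its fields and permitted values from the organization's own requirements.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical controlled vocabulary; real values would come from stakeholders.
AUDIENCES = {"employees", "customers", "partners"}


@dataclass
class ContentMetadata:
    # Track only what is needed: two required fields, two defaulted ones.
    title: str
    author: str
    audience: str = "employees"  # default spares the content creator a decision
    created: date = field(default_factory=date.today)  # filled in automatically

    def __post_init__(self):
        # Reject free-text values that are not in the controlled vocabulary.
        if self.audience not in AUDIENCES:
            raise ValueError(f"unknown audience: {self.audience!r}")
```

A content creator supplies only a title and an author; everything else is either defaulted or checked against the vocabulary at creation time.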

Administration: Not all content is worth keeping forever, and expiration or archive dates should be used to delete or off-load unnecessary content.
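One way an expiration date might drive housekeeping is sketched below. The `expires` key and the dictionary layout are assumptions made for illustration, not features of any particular CMS.

```python
from datetime import date


def partition_by_expiry(items, today):
    """Split items into (keep, expired) using an optional 'expires' date."""
    keep, expired = [], []
    for item in items:
        expires = item.get("expires")  # None means no expiration was set
        if expires is not None and expires <= today:
            expired.append(item)
        else:
            keep.append(item)
    return keep, expired
```

The expired list can then be handed to whatever delete or archive routine the system provides.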

Meta data can also describe restrictions on the use of content. The Dublin Core, for example, includes an attribute to describe digital rights governing the document. Additional attributes can be used to trigger the inclusion of boilerplate text. For example, a financial services firm may want a standard legal disclaimer included in all marketing solicitations sent to potential customers.

Content is often targeted to particular users, and access controls on directories and folders can keep privileged information from prying eyes. A more flexible approach is to associate access control information with the content itself rather than with the storage area of the content. This meta data can be used to control how search engines respond to queries or to trigger the inclusion of boilerplate text describing the restricted use of the content.
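The idea of attaching access control to the content rather than to its folder can be sketched as a filter over search results. The `allowed_groups` attribute is a hypothetical name, and treating content with no access meta data as public is an assumption of this sketch; many organizations would choose the opposite default.

```python
def visible_results(results, user_groups):
    """Return only the hits whose 'allowed_groups' overlap the user's groups."""
    groups = set(user_groups)
    visible = []
    for hit in results:
        allowed = hit.get("allowed_groups")
        # Assumption: hits with no access meta data are treated as public.
        if allowed is None or groups & set(allowed):
            visible.append(hit)
    return visible
```

Because the rule travels with each item, moving content to a different folder does not change who can find it.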

Search and Navigation: The majority of content meta data describes content. Meta data attributes most commonly include the name of the author, department or organization responsible for the content, description of the content, keywords, creation and revision dates, intended audience and categories. These attributes are especially useful for search and navigation.

Navigation is the divide-and-conquer approach to finding content. With search, you start with your best guesses at terms that appear frequently in the documents that interest you while not appearing too frequently in all the rest. We all know how well that works. Navigation takes a different tack: it methodically eliminates what does not interest you using a top-down approach. This approach requires taxonomies implemented, in part, by meta data.

Taxonomies are often hierarchical and provide an intuitive mechanism for drilling from general topics (e.g., financial services) to narrow topics (e.g., home equity line of credit). Non-hierarchical taxonomies support selection from linear lists (e.g., Home Page, Products, Services, Press, About the Company, Contact Us) that again bring the user to focused content. Most CMSs require multiple taxonomies.
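A hierarchical taxonomy can be represented as nested dictionaries, with drill-down implemented as a walk along a path of terms. In the sketch below, only "financial services" and "home equity line of credit" come from the example above; the intermediate terms and the `narrower_terms` helper are hypothetical.

```python
# Each key is a term; its value holds the narrower terms beneath it.
TAXONOMY = {
    "financial services": {
        "lending": {  # hypothetical intermediate level
            "home equity line of credit": {},
            "mortgages": {},
        },
        "investments": {},
    },
}


def narrower_terms(taxonomy, path):
    """Return the sorted child terms found under the given path of terms."""
    node = taxonomy
    for term in path:
        node = node[term]  # raises KeyError for a term not in the taxonomy
    return sorted(node)
```

Calling `narrower_terms(TAXONOMY, ["financial services"])` lists the next level of choices, which is what a navigation menu would present at each step.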

A content management application may use geography, organization, product and a host of other taxonomies to categorize content. Each of these taxonomies is associated with a facet that structures the content. (If you are familiar with OLAP, a facet plays a role similar to that of a dimension.) When developing a meta data standard, identify the facets along the lines of how users think when dealing with content.

This is easier said than done. To start, look at how content is described to find clues to relevant facets (e.g., from the description "this is the sales quota report for the Northeast region," you can determine document type and geography). Also look for negative examples. Scan the results returned by a search engine and pick several irrelevant hits. What is the first characteristic you notice that indicates that content is irrelevant? If you search for sales information on a product and get a technical design document, you can readily tell that it is irrelevant by function or intended audience. If you search for images of a product and get text documents describing the product, you know the results are the wrong content type. Function, intended audience and content type are good candidates for facets in content management systems.
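Once facets such as function, intended audience and content type are recorded as meta data, narrowing a result set becomes a simple filter. The sketch below assumes each item is a dictionary keyed by facet name; the function names and the sample facet keys in the usage note are illustrative only.

```python
from collections import Counter


def filter_by_facets(items, **selected):
    """Keep the items whose meta data matches every selected facet value."""
    return [
        item
        for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items carry each value of one facet."""
    return Counter(item[facet] for item in items if facet in item)
```

For example, `filter_by_facets(docs, content_type="image", function="sales")` would drop the technical design documents and text files described above, while `facet_counts(docs, "content_type")` gives the per-value totals a navigation panel might display.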

When defining meta data standards for enterprise content management, focus on the key tasks: administration, and search and navigation. Much administrative meta data is tracked by content and document management systems by default, but additional elements, such as archive dates, may be required. Describe content with meta data in a way that search engines and navigation systems can use. Identify facets for organizing content. Most importantly, define the standard with stakeholders, content creators, system administrators and end users in mind.
