Shakespeare may have said that a rose by any other name would smell as sweet, but the Bard was probably not an information management professional. In the data world, names and things rarely line up one to one: the same character string may be used to refer to more than one object, and one object may have many character strings as handles. The result is confusion and, potentially, conceptual problems with uniqueness and referential integrity. This month, we'll start to look at approaches to managing what we could call "naming meta data."
When deciding what color to paint an exterior wall, there is a significant difference between red, crimson and orange red. However, replace one of these shades with another in a traffic light, and people will still stop when that color is lit. It is the context and the application that dictate whether those name strings represent two separate objects or just one. While people can discern both the similarity and distinction between two things (despite the words we use to refer to those things), computers have a more difficult time making the distinction.
Deciding when different names refer to the same thing is one of the more significant issues that a meta data strategy must address. The ISO/IEC 11179 Metadata Registries standard (http://metadata-stds.org/11179) deals with it by characterizing the differences and similarities between data elements, value domains, data element concepts and conceptual domains. The following definitions are taken from Part 1 of the 11179 Standard:
- A data element is a fundamental unit of data that an organization creates, manages and disseminates.
- A value domain is the set of permissible valid values for a data element.
- A data element concept is the concept of which data elements form its extension, without reference to a specific value domain.
- A conceptual domain is the concept of which value domains form its extensions, without reference to a specific value domain.
To illustrate these ideas, a data element is an indivisible object bound to a representation (possibly incorporating a value domain, data type, unit of measure and a format specification), such as a "State Code," "Street Address" or "Product ID." A data element concept refers to the perception of the data element. For example, "employee compensation" may evoke the understanding of an amount of money given to an employee, but remains a data element concept because it is not specified in terms of a currency or a time period over which the employee is paid, or whether that amount is paid as salary, bonus or other benefits.
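One way to make the distinction concrete is a small Python sketch. The class names, field names and the two sample elements below are illustrative assumptions, not structures defined by the 11179 standard; the point is only that a data element concept carries meaning, while a data element binds that meaning to a representation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElementConcept:
    """The idea being described, with no representation attached (hypothetical class)."""
    name: str
    definition: str


@dataclass(frozen=True)
class Representation:
    """How values are actually recorded (hypothetical class)."""
    data_type: str
    unit_of_measure: Optional[str] = None
    format_spec: Optional[str] = None


@dataclass(frozen=True)
class DataElement:
    """A concept bound to one specific representation (hypothetical class)."""
    name: str
    concept: DataElementConcept
    representation: Representation


# The concept says nothing about currency, pay period or type of payment.
compensation = DataElementConcept(
    name="employee compensation",
    definition="An amount of money given to an employee",
)

# Two invented data elements that share the concept but differ in representation.
annual_salary_usd = DataElement(
    name="Annual Salary (USD)",
    concept=compensation,
    representation=Representation("decimal", "USD per year", "0.00"),
)
monthly_bonus_eur = DataElement(
    name="Monthly Bonus (EUR)",
    concept=compensation,
    representation=Representation("decimal", "EUR per month", "0.00"),
)


def same_concept(a: DataElement, b: DataElement) -> bool:
    """True when two data elements are different renderings of one concept."""
    return a.concept == b.concept
```

Here `same_concept(annual_salary_usd, monthly_bonus_eur)` is true even though the two elements are not interchangeable: they share a data element concept but not a representation.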
Similarly, there is a difference between a value domain and a conceptual domain. The notion of states of the United States is a conceptual domain, consisting of the set of U.S. states. The domain is conceptual, however, because the representation of permissible values is not specified. Alternatively, we might have multiple value domains associated with the conceptual domain of U.S. states:
- Full Names (Alabama, Alaska ... Wyoming)
- U.S. Postal Service Codes (AL, AK ... WY)
- Federal Information Processing Standards (FIPS) codes (01, 02 ... 56)
Each value within each value domain has a one-to-one mapping to a value in the conceptual domain. As long as we know the context associated with a specific value (i.e., its value domain and the associated conceptual domain), we can determine what that value represents. For example, if we know that the value domain is FIPS codes and the conceptual domain is U.S. states, we know that the value 56 refers to the concept of "Wyoming" and nothing else.
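The mapping between value domains and a conceptual domain can be sketched in a few lines of Python. This is a minimal illustration, not a registry implementation: the names `USState`, `VALUE_DOMAINS`, `resolve` and `translate` are assumptions made for the example, and only three states are included for brevity.

```python
from enum import Enum


class USState(Enum):
    """Conceptual domain: the states themselves (illustrative subset of three)."""
    ALABAMA = "Alabama"
    ALASKA = "Alaska"
    WYOMING = "Wyoming"


# Each value domain maps its permissible values one-to-one onto the concepts.
VALUE_DOMAINS = {
    "full_name": {
        "Alabama": USState.ALABAMA,
        "Alaska": USState.ALASKA,
        "Wyoming": USState.WYOMING,
    },
    "usps": {
        "AL": USState.ALABAMA,
        "AK": USState.ALASKA,
        "WY": USState.WYOMING,
    },
    "fips": {
        "01": USState.ALABAMA,
        "02": USState.ALASKA,
        "56": USState.WYOMING,
    },
}


def resolve(value_domain: str, value: str) -> USState:
    """Return the concept that a value stands for within a given value domain."""
    return VALUE_DOMAINS[value_domain][value]


def translate(value: str, source: str, target: str) -> str:
    """Re-express a value from one value domain in another, via the shared concept."""
    concept = resolve(source, value)
    for representation, candidate in VALUE_DOMAINS[target].items():
        if candidate is concept:
            return representation
    raise KeyError(f"{concept} has no value in domain {target!r}")
```

With the context supplied, `resolve("fips", "56")` yields the Wyoming concept, and `translate("AK", "usps", "fips")` yields "02". Without the value domain, the string "56" alone could not be interpreted.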
Knowing what things are modeled and how they are referenced within a system allows one to effectively tease out the more interesting relationships and dependencies within a data set. These notions provide some low-level building blocks for evaluating meta data. If you have ever participated in a system migration, data standards definition project, or a business intelligence/data mining project, it should be clear that assessing, documenting and, most importantly, managing this kind of meta data is critical to project success.