
Naming Conventions and Semantic Consistency

  • December 01 2004, 1:00am EST

Wouldn't it be nice to work in a world where nomenclature inconsistencies were limited?

Those of you familiar with the Bible story of Adam in the Garden of Eden may recall that Adam is charged with naming all the animals with which he shares the garden. There is a joke describing an exchange between Adam and the Creator regarding the naming of the hippopotamus. When the beast is brought before him, Adam decides to call the animal a "hippopotamus." When asked why, Adam replies, "Because it looks like a hippopotamus."

In the data world, information management professionals are also occasionally called upon to name things (e.g., data sets, columns, objects, tags). Unfortunately, several different things sometimes end up with the same name. Alternatively, when more than one person is (independently) involved, the same thing ends up with more than one name. These problems multiply when we decide to share those things with even more people, each of whom potentially has even more names for them. This is a problem that I refer to as "semantic inconsistency," by which I roughly mean that there are differences between what a thing is and what it is called.

This problem exists within enterprises consisting of independently developed applications (along with underlying data models) that essentially evolved in a vacuum. An example involves the numerous ways that I have seen customer account number fields named: ACCT, ACCOUNT, ACCTNUM, ACCT_NUM, ACCOUNT_NUM, ACCNMBR, ACCT_NBR, etc.
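As a rough illustration of the consolidation work this implies, the sketch below maps the variant spellings listed above onto a single agreed name. The canonical name CUSTOMER_ACCOUNT_NUMBER and the function name are assumptions made for the example, not names taken from any particular system.

```python
# Variant spellings taken from the paragraph above.
ACCOUNT_NUMBER_VARIANTS = {
    "ACCT", "ACCOUNT", "ACCTNUM", "ACCT_NUM",
    "ACCOUNT_NUM", "ACCNMBR", "ACCT_NBR",
}

# Hypothetical agreed-upon name; a real team would choose its own.
CANONICAL_ACCOUNT_NUMBER = "CUSTOMER_ACCOUNT_NUMBER"


def canonical_field_name(name: str) -> str:
    """Return the agreed name for a known variant, else the cleaned-up input."""
    key = name.strip().upper()
    if key in ACCOUNT_NUMBER_VARIANTS:
        return CANONICAL_ACCOUNT_NUMBER
    return key


if __name__ == "__main__":
    for raw in ("acct_nbr", "ACCTNUM", "ZIP_CODE"):
        print(raw, "->", canonical_field_name(raw))
```

A lookup table like this only documents the disparity after the fact; it does not prevent new variants from appearing.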

It would be nice to work in an environment where this kind of dissimilarity in nomenclature was limited. However, as more information is extracted from one data set for migration or consolidation into others, and as "standardized" data exchanges (read: XML) become more common, the semantic inconsistency problem only proliferates.

One approach we are currently exploring with our clients is the introduction of naming conventions. Naming conventions are not new in the world of computing -- there are existing documented conventions for programmatic object naming whose ages rival that of the modern computer, and there are certainly naming conventions outside the computer world that have been around for a much longer time. However, as we are seeing a greater desire to consolidate meta data and define taxonomies, the use of a naming convention helps in reducing object name disparity.

Naming conventions usually follow a logical scheme based on some combination of the thing's primary objective, one or more modifiers that help describe what it is used for, and its (usually abstract) data type. For example, if we wanted to track our customers' ages, we might maintain their birth dates. These birth dates might be stored in a field called CUSTOMER_BIRTH_DATE. The primary objective is that it refers to a date associated with someone's birth; the specific "someone" (the customer) modifies the birth date; and the fact that it is a "date" completes the name.
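One way to picture such a scheme is as a small structure that assembles a name from its parts. The sketch below is only an illustration: the list of allowed class words, the field labels (modifiers, prime_word, class_word) and the underscore-separated uppercase format are assumptions chosen to reproduce the CUSTOMER_BIRTH_DATE example.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical list of approved abstract data types (class words).
CLASS_WORDS = {"DATE", "NUMBER", "NAME", "CODE", "AMOUNT"}


@dataclass(frozen=True)
class FieldName:
    modifiers: Tuple[str, ...]  # who or what the value belongs to, e.g. ("customer",)
    prime_word: str             # the primary objective, e.g. "birth"
    class_word: str             # the abstract data type, e.g. "date"

    def render(self) -> str:
        """Join the parts as MODIFIER(S)_PRIME_CLASS in uppercase."""
        class_word = self.class_word.strip().upper()
        if class_word not in CLASS_WORDS:
            raise ValueError(f"unapproved class word: {self.class_word!r}")
        parts = (*self.modifiers, self.prime_word, class_word)
        return "_".join(part.strip().upper() for part in parts)


if __name__ == "__main__":
    print(FieldName(("customer",), "birth", "date").render())
```

Rejecting unapproved class words is the point of the exercise: the convention is enforced when a name is created rather than cleaned up afterward.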

The value of a naming convention lies in its simplicity and its descriptiveness. A good naming convention, when it accompanies a well-designed set of abstract data type definitions managed as meta data, will provide a means for easily defining new data objects while conveying semantic meaning to existing named objects. In addition, a nicely crafted convention will consolidate naming across platforms, working well for business specifications and logical models, as well as XML namespaces. In fact, engineering a good naming convention along with proper application tools will allow a seamless automated generation of XML schemas based on logical specifications.
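To make the XML point concrete, here is a minimal sketch of deriving schema element declarations from conventionally named fields. It assumes that the last token of each name is its class word and that each class word maps to one XML Schema built-in type; the mapping table and function name are hypothetical, and real tooling would handle far more than this.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical mapping from class word to XML Schema built-in type.
CLASS_WORD_TO_XSD_TYPE = {
    "DATE": "xs:date",
    "NAME": "xs:string",
    "AMOUNT": "xs:decimal",
}


def schema_for(field_names):
    """Build an xs:schema with one xs:element per conventionally named field."""
    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    for name in field_names:
        # Assumption: the class word is the final underscore-separated token.
        class_word = name.rsplit("_", 1)[-1]
        if class_word not in CLASS_WORD_TO_XSD_TYPE:
            raise ValueError(f"no type mapping for class word in {name!r}")
        ET.SubElement(
            schema,
            f"{{{XS}}}element",
            {"name": name, "type": CLASS_WORD_TO_XSD_TYPE[class_word]},
        )
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    print(schema_for(["CUSTOMER_BIRTH_DATE", "CUSTOMER_LAST_NAME"]))
```

The generation step is trivial only because the names already carry their data type; with inconsistent names, the same mapping would have to be maintained by hand.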

Lastly, the act of convening a team and holding meetings to define a naming convention is a critical initial step in developing a meta data strategy. That is because the process of identifying key class words and effective modifiers reinforces the connection between an underlying data model and the business applications it is intended to support. Performing this task with some of our clients has shown that a good naming convention clarifies commonly used business terminology and how those terms are represented in the databases.

The humor in my earlier anecdote masks an interesting subtext: humans take their guidance for naming things from potentially many different sources. Yet regardless of what the predominant drivers for naming are, it is worthwhile to talk to some other folks about them so that you'll ultimately agree on the results.
