In the world of data management, reference data remains largely unexplored terrain. Its existence is known to IT professionals, and it is a territory all must pass through on their journey to successful database design. Yet most travelers hurry along to the more interesting places where they can find the data that has greater significance to business users – data on parties, roles and activities that combine to generate the transactions of the enterprise. However, it is worthwhile to survey reference data and push beyond the limited horizons that constrain our understanding to gain a better appreciation of this long-neglected area.

One of the most striking features of reference data, once it is examined, is its structural regularity. In most databases, reference data consists of many small tables, each with a single-column primary key that is usually a code or acronym. (See Figure 1.) There are a few nonkey columns, which nearly always include a long text column holding a name. Countries, currencies, industry classifications and credit ratings are all examples of reference data that fit this pattern. These tables tend to have relatively few rows and change infrequently, at least when compared to the rest of the database. This structural simplicity, low data volume and slow rate of change seem to be the reasons why reference data is generally regarded as unexciting. However, if we look at the behavior and usage of reference data, patterns emerge that suggest a distinct class of data that has unique features and requires special management.
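
The structural pattern described above can be illustrated with a small sketch. It is not taken from the article: the table name `currency`, its column names and the helper `lookup_name` are hypothetical, and an in-memory SQLite database is assumed purely for demonstration.

```python
import sqlite3

# Hypothetical reference table following the pattern described above:
# a single-column primary key holding a short code, plus a long text
# column holding the name. All names here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE currency (
        currency_code TEXT PRIMARY KEY,  -- short code or acronym
        currency_name TEXT NOT NULL      -- long descriptive name
    )
    """
)

# Reference tables typically hold few rows that change infrequently.
conn.executemany(
    "INSERT INTO currency (currency_code, currency_name) VALUES (?, ?)",
    [("USD", "US Dollar"), ("EUR", "Euro"), ("JPY", "Yen")],
)


def lookup_name(code):
    """Return the name stored for a code, or None if the code is unknown."""
    row = conn.execute(
        "SELECT currency_name FROM currency WHERE currency_code = ?",
        (code,),
    ).fetchone()
    return row[0] if row else None
```

A country, industry classification or credit rating table would take the same shape, with only the table and column names changing.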
