Is Data Modeling an Arcane Sport?

  • September 01 2002, 1:00am EDT

I was browsing an online bookstore the other day, trying to remember the name of a female author whose book on data modeling I much admired but had misplaced. While sifting through the available titles, I was quite honestly startled at the current state of books on data modeling. In the past, one could find dozens of titles pertaining to data modeling; however, this is no longer the case.

Why is this? Has data modeling gone out of style? Two things have happened. First, as we approached Y2K, there was a strong emphasis on packaged software installation over custom systems. Because packaged software comes with data structures that are defined by the software, customers who install the package have no need to worry about data designs.

The second influence behind the decline of industry interest in data modeling techniques is the ready availability of relational database management systems (DBMSs) at very low cost. Microsoft's pricing of Access and SQL Server created a sense that relational databases are useful to anyone with a personal computer and that special skills such as data modeling are unnecessary.

The world does not end if a few fewer book titles are available. Without some of this work being published, however, one risks overlooking important recent developments. One is the concept of data model patterns: the notion that most data models for a given subject area share a common structure, so it is not necessary to reinvent the wheel. (See David Hay's book Data Model Patterns for more on this topic.)

Another important advance is the recognition that a fully normalized model may not serve all needs and that a different approach – one called dimensional modeling – fulfills quite different types of system needs. Created by Ralph Kimball and explained in his important text, The Data Warehouse Toolkit, dimensional modeling addresses two traditional problems faced by query systems. Most relational database management systems take too much time to respond to complex queries, and business users find that a traditional normalized data structure is too complex and confusing for them to use. Dimensional modeling produces data designs that perform better and are much more intuitive. However, the designs are more fragile and must change as user query needs change.
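To make the idea concrete, here is a minimal sketch of a dimensional design in its common "star" form: one fact table of measurements surrounded by descriptive dimension tables. It uses Python's built-in sqlite3 module. The table names, column names and sample rows are hypothetical illustrations and are not taken from The Data Warehouse Toolkit.

```python
import sqlite3


def build_star_schema() -> sqlite3.Connection:
    """Create a tiny in-memory star schema (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables: descriptive attributes that users filter and group by.
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year     INTEGER,
            month    INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name        TEXT,
            category    TEXT
        );
        -- Fact table: one row per sale, keyed by the dimensions, holding the measures.
        CREATE TABLE fact_sales (
            date_key    INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            units       INTEGER,
            revenue     REAL
        );
        INSERT INTO dim_date VALUES (20020901, 2002, 9), (20021001, 2002, 10);
        INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
        INSERT INTO fact_sales VALUES
            (20020901, 1, 3, 30.0),
            (20020901, 2, 1, 15.0),
            (20021001, 1, 2, 20.0);
    """)
    return conn


def revenue_by_category(conn: sqlite3.Connection, year: int) -> dict:
    """Total revenue per product category for one year."""
    rows = conn.execute(
        """
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_date    AS d ON d.date_key = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        WHERE d.year = ?
        GROUP BY p.category
        ORDER BY p.category
        """,
        (year,),
    ).fetchall()
    return dict(rows)


conn = build_star_schema()
print(revenue_by_category(conn, 2002))
```

Every business question follows the same shape: join the fact table to the dimensions of interest, filter, and aggregate. That uniformity is what makes the design easier for business users to navigate. The fragility mentioned above also shows here: a question about a dimension that was never modeled, such as sales by store, would require changing the design.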

A good data model is extremely important to a successful custom development effort. It reflects the business rules that define a business and ensures that prior limitations from old technology or poor design do not restrict the future growth of a business. A poor data model cripples the DBMS, invalidating any ability to provide integrity checks. A good model, coupled with a good relational DBMS, protects domain, entity and referential integrity of the data:

  • Domain integrity refers to the accuracy of data values relative to the domain of permitted values.
  • Entity integrity refers to the accuracy with which each row of a table represents a unique entity of the real world.
  • Referential integrity refers to the accuracy with which data elements refer to entities. An example is a purchase order, which contains references to both the customer who placed the order and the products being ordered. Referential integrity concerns the validity of those references: that the customer really exists and that the products really exist.
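As an illustration of the three kinds of integrity, the sketch below declares each one as a constraint and lets the DBMS reject bad data. It uses Python's built-in sqlite3 module. The schema is hypothetical and deliberately simplified: each purchase order references a single product, and the `violates` helper is an invented name used only for this example.

```python
import sqlite3


def build_orders_db() -> sqlite3.Connection:
    """Create a small in-memory purchase-order schema (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,   -- entity integrity: one row per customer
            name        TEXT NOT NULL
        );
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE purchase_order (
            order_id    INTEGER PRIMARY KEY,   -- entity integrity
            customer_id INTEGER NOT NULL
                        REFERENCES customer(customer_id),  -- referential integrity
            product_id  INTEGER NOT NULL
                        REFERENCES product(product_id),    -- referential integrity
            quantity    INTEGER NOT NULL
                        CHECK (quantity > 0)               -- domain integrity
        );
        INSERT INTO customer VALUES (1, 'Acme Corp');
        INSERT INTO product  VALUES (10, 'Widget');
    """)
    return conn


def violates(conn: sqlite3.Connection, sql: str) -> bool:
    """Return True if the DBMS rejects the statement with an integrity error."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


conn = build_orders_db()
insert = "INSERT INTO purchase_order VALUES "

print(violates(conn, insert + "(100, 1, 10, 5)"))    # valid order
print(violates(conn, insert + "(101, 1, 10, 0)"))    # domain: quantity not permitted
print(violates(conn, insert + "(100, 1, 10, 2)"))    # entity: duplicate order_id
print(violates(conn, insert + "(102, 999, 10, 1)"))  # referential: no such customer
```

The point of the design is that the rules live in the data model rather than in each application: once the constraints are declared, every program that writes to these tables is held to them.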

Put fact 1 (interest in data modeling may be declining) together with fact 2 (data modeling is still vital), and you get fact 3: finding the necessary data modeling skills may be a challenge for your company. Outstanding data modeling skills do still exist in many companies. However, as those talented individuals retire or move on to other roles, replacing them will be difficult.

How hard should you strive to recruit data modeling expertise? For organizations committed to the use of packaged solutions, there is little reason to recruit or grow data modeling skills. For organizations embarking on the design of a custom application, data model patterns are certainly the place to start. David Hay's models are available very inexpensively, but some find them too simplistic to serve as final logical designs. Several DBMS vendors, including NCR and IBM, have packaged their customer experience and offer extensive models for specific subject areas. While these models provide excellent patterns, they are also too generic to serve as the final logical data model for an enterprise. There is no avoiding the fact that any data model pattern must be extended to meet the specific needs of a custom design.

For these more specialized data modeling skills, companies often have to turn to outside experts, whether a systems integrator such as Accenture, a DBMS vendor such as Teradata Professional Services or small entrepreneurs such as Ralph Kimball and Len Silverston.

If your organization has this precious data modeling talent on staff, treat it well. If not, then it will be wiser to hire the skill from a consultant for the short duration needed for the modeling exercise, instead of trying to hire the skill as permanent staff.
