OCT 1, 2006 1:00am ET

Related Links

Visiting Nurse Service Cares About Cloud Security
October 25, 2011
Light at the End of the Silo
October 28, 2010
Pitney Bowes Releases Enhancements to MapInfo Professional
September 13, 2010

Web Seminars

Suit Yourself: An Effective Recipe for Self-Service Analytics
March 20, 2012
How to Narrow the IT/Business Communication Gap
March 21, 2012
Enhance and Expand BI with Mobile
Available On Demand

Is Data Modeling Sufficient for Database Design?

Print
Reprints
Email

Data modeling has been with us for several decades now and has been wildly successful. It has provided IT professionals with a set of tools, techniques and methodologies that have not only allowed us to implement databases, but which have also given us a way to communicate and discuss data architecture. So pervasive is this success that today data modeling is often taken for granted. It looks mature and rather static. Entry-level staff have to learn it, and experienced professionals may need to extend their skill sets into advanced areas, such as dimensional modeling. Yet even if there is a need for individual growth, there seems to be an unspoken assumption that the body of knowledge built up around data modeling is all that is needed to implement successful databases. Few people seem to question if this is really so. Is it possible to entertain the thought that data modeling does not completely encompass everything that is needed for successful database design? I would submit that there really are limits to what data modeling - at least in its present form - can achieve, and that in practice it cannot fully describe the architecture of any database.

Code Tables, Indicators and Nulls

One of the areas that has troubled me about data modeling involves the use of code tables. These pesky little tables, consisting of just a few columns and a few rows, are incredibly persistent in all database designs. They generally comprise anywhere from 20 to 50 percent of the tables in any data model. Their names are quite familiar to us: County Code, Product Category, Customer Type and so on. What is strange about these tables is that they usually have to be populated before a system or database goes live. Could they possibly have something to do with design?

Similarly, indicators - otherwise known as flags or switches - are pervasive types of attributes in data models. We argue over whether they should be physically implemented as "true" or "false" versus "yes" or "no," or some other representational scheme, but we rarely stop to think what they are doing in a database in the first place. And yet, like code tables, they are quite common in database designs, even if it seems a little odd that a characteristic of an entity should have a value like yes or no. Again, perhaps we are looking at something that partly embodies design rather than purely representing data.

Attributes that are indicators, or which are foreign keys from code tables, are often found in entities recognized as master data. Such entities include perennial favorites like Customer and Product, but in more specialized databases they are still found in entities that are readily classified as master data. This is not a universal rule, and both indicator and code attributes can be found in transactional data. Nevertheless, these two kinds of attributes are found most frequently in master data, and it is instructive to ask why this is. The answer seems to be, as we shall see below, that they are used to describe structures within database designs that cannot be represented by current data modeling techniques.

A separate issue, which at first appears quite unrelated, involves specifying if columns are null or not null in data models. Anybody with any experience of data modeling is likely to have come across attributes that must be null in certain circumstances, but must be not null in other circumstances. Data modeling tools do not seem to be built to handle this requirement: they ask if an attribute is potentially nullable or always not null. They cannot record if a column must be null under certain conditions but must be not null under other conditions. Once again, this is a hint that there are limits to how precisely data modeling can specify database design.

Logical versus Physical

Before we look at an example of the limits of data modeling, it is important to recognize a cultural issue in data modeling, and indeed IT as a whole. This is the great divide between the logical and physical perspectives of data management. On the one hand we have data modelers and data administrators who deal in design and related metadata, and are perceived as the builders, if not the owners, of data models. On the other hand we have database administrators and programmers who have to get databases up and running, and who look to those who traffic in more logical constructs for as much practical help as possible. Nevertheless, in my experience data modelers and data administrators nearly always see themselves as responsible solely for designs, and think that the way in which business users choose to put data into a database is entirely up to the business users. This protective attitude simplifies the life of a great number of IT professionals, but it may set limits on the level to which databases can be designed.

One problem is that if physical values that populate database columns somehow describe the structure of a database, then this structure is never going to be captured in a data model since a model never contains physical values - only the implemented database does. The logical/physical divide between models and databases will always stand in the way of such design structure being represented.

The World of Hidden Subtypes

Returning to the issues presented by code tables, indicators, and nulls, it is possible to see that database design consists of more than can be represented in a data model.

Suppose we have a manufacturing company that makes kitchen appliances and plumbing fixtures. The Product entity for this company can be represented by the model fragment shown in Figure 1.

 Figure 1: Data Model Fragment for Product Entity

Advertisement

Twitter
Facebook
LinkedIn
Login  |  My Account  |  White Papers  |  Web Seminars  |  Events |  Newsletters |  eBooks
FOLLOW US
Please note you must now log in with your email address and password.